USE OF TRANSLATIONALLY ALTERED RNA TO CONFER RESISTANCE TO MAIZE
DWARF MOSAIC VIRUS AND OTHER MONOCOTYLEDONOUS PLANT VIRUSES

The invention relates generally to the genetic engineering of monocotyledonous 
plants to resist virus infection through the expression of inhibitory transcripts or proteins 
derived from the inhibited virus. In another aspect, the invention relates to the elucidation 
and characterization of the genomic structure and organization of a maize dwarf mosaic
virus. 

Plant viruses are a major problem in agriculture and cause significant losses in crop 
yield each year. In the past, available approaches for combating plant viruses were primarily
limited to the selection of plant lines which exhibited genetic resistance to virus infection 
and the application of chemicals designed to protect plants from the organisms responsible 
for introducing the virus to the plant (i.e. viral vectors). 

Recently, a number of approaches for combating plant viruses have been developed 
which are based upon the transformation of susceptible plant species with chimeric genes 
which express transcripts or proteins that inhibit viral infection. These approaches include 
genetically engineering plants to express viral coat protein or coat protein transcripts, viral 
replicases in unmodified or modified form, antisense genes or ribozymes targeting viral 
genomic RNA or transcripts, and altered viral transcripts (for a review, see Fitchen, J.H. et
al., Ann. Rev. Microbiol. 47: 739-763 (1993)). To apply any of these approaches, knowledge 
of the structure and organization of the genome of the target virus is necessary. 

With respect to the expression of altered viral transcripts to confer viral resistance, 
limited success has been reported in dicotyledonous plants through the expression of viral 
coat protein transcripts which have been modified to render them incapable of translation. 
Expression of such "untranslatable" viral transcripts in tobacco has been reported to inhibit
tobacco etch virus (Lindbo, J.A. et al., Mol. Plant-Microbe Int. 5(2): 144-153 (1992); Lindbo,
J.A. et al., Virology 189: 725-733 (1992); WO 93/17098 to Dougherty, W.G. et al. (Sept. 2,
1993); Lindbo, J.A. et al., The Plant Cell 5: 1749-1759 (1993)), tomato spotted wilt virus
(Pang, S. et al., Bio/Technology 11: 819-824 (1993); DeHaan et al., Bio/Technology 10:
1133-1137 (1992)) and potato virus Y (Van der Vlugt, R.A. et al., Plant Mol. Biol. 17:
431-439 (1991)).

The ability of such untranslatable RNAs to inhibit viral infection does not appear to be
universal, however. Failure of such altered viral transcripts to inhibit viral infection has
been reported for tobacco mosaic virus (Powell, P.A. et al., Virology 175: 124-130 (1990))
and zucchini yellow mosaic virus (Fang, G. et al., Mol. Plant-Microbe Int. 6(3): 358-367
(1993)), a potyvirus similar to tobacco etch virus. Additional unreported failures may also
exist, since such negative results are rarely published.

The most prevalent virus infecting maize in the United States and Europe is maize 
dwarf mosaic virus (MDMV). This virus is classified as a member of a family of plant viruses
known as the potyviruses. The potyviruses are the largest group of plant viruses and are
characterized by a long, flexuous rod particle morphology and are non-persistently
transmitted by aphid vectors (see Hollings, M. and Brunt, A., pages 732-807 of "Handbook
of Plant Virus Infection and Comparative Diagnosis", ed. by E. Kurstak, pub. by
Elsevier/North Holland Biomedical Press, Amsterdam (1981)). The potyviruses have a
genome composed of a single strand positive sense messenger RNA molecule which is
transcribed and translated as one polyprotein that is subsequently cleaved into its
component parts. The family is composed of many taxonomic strains, with the two most
common being strains A and B. These strains are differentiated by the ability of MDMV-A to
infect johnsongrass, which is the overwintering host. MDMV-A is primarily localized to the
southeastern United States due to the occurrence of johnsongrass in this area. MDMV-B is 
more widespread and can be found in the U.S. corn belt and throughout Europe (i.e. Spain, 
France, and Italy). MDMV-B is the most economically important maize virus due to its 
widespread occurrence. 

Viral diseases of maize result in an estimated 5% annual yield reduction and also
reduce overall plant health, which results in increased susceptibility to other pathogens.
Experimental trials using common maize inbreds and hybrids have shown yield reductions
from MDMV as great as 35% in inoculated plots. MDMV is a major crop pest in maize, where
it causes mosaic symptoms and dwarfing of infected plants, ultimately reducing crop yields
(Knoke, J.K. et al., pages 235-281 of "Diseases of Cereals & Pulses", volume I, ed. by
Singh, U.S. et al., pub. by Prentice Hall, Englewood Cliffs, NJ (1992)). When found in
combination with maize chlorotic mottle virus (MCMV), a synergistic condition known as corn
lethal necrosis results, causing even more severe crop damage (see Uyemoto, J.K., pages
141-143 of "Proc. Int'l. Maize Virus Disease Colloq. & Workshop", ed. by Gordon, D.T. et
al., pub. by Ohio State Univ. and Ohio Agric. Res. Dev. Center, Wooster, OH (1983)).

The economic impact of yield losses due to MDMV has generated considerable
interest in developing strategies to combat this virus. To date, however, only limited success
has been achieved in reducing the adverse impact of this virus. Thus there remains a need
to identify additional effective means for protecting host plants from MDMV.

Both strains A and B of MDMV are transmitted in nature by aphids in a non-persistent 
manner, thus insect control is not a practical control method. The most effective method of 
control of these diseases is the use of resistant germplasm. In maize, sources of resistant 
germplasm exist to both strains of the MDMV, but the efficacy of the resistance is somewhat
variable and identification of this material can be difficult. Studies have shown that
resistance to MDMV is not the result of a single, dominant gene, but rather is multigenic
(2-5 genes). There has been an abundance of research on the development of alternative
strategies for conferring resistance in transgenic plants. Most of these strategies have
focused on the expression of viral genes (i.e. the viral coat protein) in plants as a means of
conferring resistance. The benefits of these strategies are that the resistance can be
developed to viruses in which effective natural resistance cannot be identified and the
resistance is more easily transferred to agronomically desirable plant lines. The majority of
this work has focused on coat protein mediated resistance which is based on the
expression of the viral coat protein in the plant. Coat protein mediated resistance has been
particularly effective for some viruses (e.g. tobacco mosaic virus) but inconsistent for other
viruses (e.g. potyviruses) when tested in model systems such as tobacco and in 
economically important grain crops such as maize, wheat, and rice. 

More recently, another virus resistance strategy has been developed which conferred 
an immune phenotype in plants transformed with segments of virus sequence. The 
phenomenon has been termed RNA-mediated resistance and is thought to be similar to 
sense suppression or co-suppression described in other plant systems. Specifically, plants 
were transformed with a sequence encoding the virus coat protein which had been modified 
to cause premature termination during translation. The expression of this untranslatable 
viral coat protein sequence at high levels was hypothesized to activate an RNA degradation
cycle which eliminated the transgene mRNA in a sequence specific manner. The pathway
was then believed to be capable of also eliminating an infecting virus which contains
sequence highly homologous (>90%) to the transgene sequence. Since the original
description of RNA-mediated resistance (see Lindbo, J.A. et al., Mol. Plant-Microbe Int. 5(2):
144-153 (1992) and DeHaan et al., Bio/Technology 10: 1133-1137 (1992)), there have
been additional descriptions of this form of resistance. Furthermore, it has been shown that
prior work thought to be resistance due to expression of a viral protein is more likely to be
RNA-mediated resistance. However, this strategy has not been effective for all viruses (see
Powell, P.A. et al., Virology 175: 124-130 (1990) and Fang, G. et al., Mol. Plant-Microbe Int.
6(3): 358-367 (1993)). The examples of RNA-mediated resistance have been limited to
model dicot hosts such as tobacco and potato. It is not known if this resistance will be
effective in monocots nor what factors will be necessary for induction of this resistance. 

The genomic structure and organization of MDMV has remained largely 
uncharacterized except for the elucidation of viral coat protein coding sequences (see 
Frenkel, M. J. et al., J. Gen. Virol. 72: 237-242 (1991); see also Murray, L.E. et al.,
Bio/Technology 11: 1559-1564 (1993)). As a result, it is currently not possible to apply many
of the more recent recombinant-DNA based approaches that have been used for combating
plant viruses to MDMV. These approaches require a more extensive understanding of the
structure and organization of the genome of the target virus than is currently available for
MDMV.

In one aspect, the present invention provides a method for protecting a
monocotyledonous plant from infection by a virus by producing in such a plant an RNA
molecule whose sequence corresponds, at least in part, to an mRNA or the plus strand RNA
produced by the virus. The RNA molecule produced according to the method of the
invention is modified so that it cannot be translated completely as compared to the viral
RNA to which it corresponds. Included within this aspect of the invention are chimeric
genes designed to express such modified RNA molecules in monocotyledonous plants, as
well as monocotyledonous plants containing such chimeric genes stably integrated into their
genome. Such plants and their progeny are protected from infection by monocotyledonous
viruses that produce messenger or plus-sense RNA which shares sequence identity with the
modified RNA molecule encoded and expressed by the stably integrated chimeric gene. 

Another aspect of the invention is based upon structural and organizational
information that has been elucidated for the genome of strain B of Maize Dwarf Mosaic
Virus (MDMV-B) upstream of the coat protein gene. Included in this aspect of the invention
are chimeric genes designed to express coding sequences for MDMV-B proteins including
the coat protein (nucleotides 7308-8291 of SEQ ID No. 1), the RNA dependent RNA
polymerase (RdRp) (nucleotides 5745-7307 of SEQ ID No. 1), proteinase (nucleotides
4452-5744 of SEQ ID No. 1), a 6K protein (nucleotides 4293-4451 of SEQ ID No. 1),
cylindrical inclusion protein (CIP) (nucleotides 2376-4292 of SEQ ID No. 1), P3 proteinase
(nucleotides 1134-2375 of SEQ ID No. 1), and a portion of the helper component-P2
proteinase (HC-Pro) (nucleotides 3-1133 of SEQ ID No. 1). Methods for protecting plants
from MDMV infection by transforming them with these chimeric genes are included within
this aspect of the invention along with the resulting transgenic plants and their progeny. 
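
The nucleotide ranges listed above can be cross-checked against the polyprotein numbering of SEQ ID No. 2 with simple arithmetic. The following Python sketch is illustrative only. It assumes that the open reading frame begins at nucleotide 3 (the first nucleotide listed for the HC-Pro portion) and that all positions are 1-based and inclusive; the names ORF_START, REGIONS, nt_to_aa and polyprotein_map are hypothetical and are not part of the disclosure.

```python
# Nucleotide ranges of SEQ ID No. 1 as listed in the text (1-based, inclusive).
REGIONS = {
    "HC-Pro (partial)": (3, 1133),
    "P3": (1134, 2375),
    "CIP": (2376, 4292),
    "6K": (4293, 4451),
    "proteinase": (4452, 5744),
    "RdRp": (5745, 7307),
    "coat protein": (7308, 8291),
}

# Assumption: the reading frame starts at the first listed HC-Pro nucleotide.
ORF_START = 3


def nt_to_aa(nt_start, nt_end, orf_start=ORF_START):
    """Convert a codon-aligned nucleotide range to polyprotein positions."""
    length = nt_end - nt_start + 1
    if length % 3 != 0 or (nt_start - orf_start) % 3 != 0:
        raise ValueError("range is not aligned to the assumed reading frame")
    first = (nt_start - orf_start) // 3 + 1
    last = first + length // 3 - 1
    return first, last


def polyprotein_map(regions=REGIONS):
    """Return the amino acid range for every listed region."""
    return {name: nt_to_aa(*span) for name, span in regions.items()}


if __name__ == "__main__":
    for name, (first, last) in polyprotein_map().items():
        print(f"{name}: amino acids {first}-{last}")
```

Under these assumptions the coat protein range maps to amino acids 2436-2763, which agrees with the coat protein positions given for SEQ ID No. 2 elsewhere in this description.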

The MDMV-B coding sequences may also be modified according to the first aspect of 
the present invention so that the RNA derived therefrom cannot be properly translated. The
present invention includes chimeric genes designed to express such translationally altered
MDMV-B RNAs in plants. Methods for protecting plants from MDMV infection by 
transforming them with these chimeric genes are included within this aspect of the invention 
along with the resulting transgenic plants and their progeny. 

The following sequences according to the invention are disclosed in the sequence listing: 

SEQ ID No. 1: Sequence of the polycistronic messenger RNA of maize dwarf mosaic
virus, strain B.

SEQ ID No. 2: Sequence of the polyprotein encoded by the polycistronic messenger RNA
of maize dwarf mosaic virus, strain B.

SEQ ID No. 3: First internal control alcohol dehydrogenase PCR primer used in analysis
of T0 plants as described in Example 4.

SEQ ID No. 4: Second internal control alcohol dehydrogenase PCR primer used in
analysis of T0 plants as described in Example 4.

SEQ ID No. 5: First PCR primer for the synthetic PAT gene used in analysis of T0 plants
as described in Example 4.

SEQ ID No. 6: Second PCR primer for the synthetic PAT gene used in analysis of T0
plants as described in Example 4.

SEQ ID No. 7: First PCR primer for the NIa proteinase gene used in analysis of T0 plants
as described in Example 4.

SEQ ID No. 8: Second PCR primer for the NIa proteinase gene used in analysis of T0
plants as described in Example 4.

For purposes of describing the present invention, the term "translationally altered 
RNA" is used to refer to a modified form of a naturally occurring messenger RNA sequence 
which cannot be completely translated compared to the unmodified, naturally occurring
form. A translationally altered RNA may be incapable of being translated at all or it may be
capable of being partially translated into an attenuated peptide corresponding to a portion
of the peptide encoded by the naturally occurring messenger RNA sequence from which the
translationally altered RNA is derived.

The coding sequence for a naturally occurring viral RNA sequence may be modified to 
encode a translationally altered RNA, for example, by removing its ATG initiation codon or
by utilizing a portion which does not include the initiation codon. Other means for
translationally altering a naturally occurring viral RNA molecule include introducing one or
more premature stop codons and/or interrupting the reading frame. 

The basis for the present invention is two-fold. The first basis for the present 
invention is the discovery that reduced susceptibility to infection by a virus may be conferred 
upon a monocotyledonous plant by producing in the plant a translationally altered RNA
molecule corresponding in sequence to a plus-sense or messenger RNA molecule of the
target virus. The second basis for the present invention is the elucidation and
characterization by the inventors of the genomic structure and organization of strain B of
maize dwarf mosaic virus (MDMV-B). These two bases are addressed consecutively below
and are both represented by the examples demonstrating resistance to MDMV-B via
expression of a translationally altered RNA in a transgenic maize plant.

The first aspect of the present invention is directed to a general method for reducing 
the susceptibility of a monocotyledonous plant to viral infection by producing in the plant a
translationally altered RNA molecule corresponding to a messenger RNA sequence of the
target virus. Viruses infecting monocotyledonous plants will be referred to as
monocotyledonous viruses. A method is provided for protecting progeny of a
monocotyledonous parent plant from viral infection by transforming said parent plant with a
chimeric gene comprising a monocotyledonous plant promoter operably linked to a
nucleotide sequence derived from the genomic sequence of a virus infecting
monocotyledonous plants, wherein said nucleotide sequence contains a modification
rendering a messenger RNA transcribed from said nucleotide sequence incapable of
complete translation, and obtaining progeny plants. Alternatively, said progeny of a parent
plant can be protected from viral infection by breeding the parent plant with a
monocotyledonous plant having an inheritable trait of resistance to infection due to its
expression of a chimeric gene comprising a monocotyledonous plant promoter operably
linked to a nucleotide sequence derived from the genomic sequence of a virus infecting
monocotyledonous plants, wherein said nucleotide sequence contains a modification
rendering a messenger RNA transcribed from said nucleotide sequence incapable of
complete translation.

The preferred approach for producing the translationally altered RNA molecule in a
monocotyledonous plant is by introducing a chimeric gene designed to express this
molecule into the genome of the plant. Such a chimeric gene will consist of a plant
promoter operably linked to a nucleotide sequence derived from the genomic sequence of a
virus infecting monocotyledonous plants, wherein said nucleotide sequence contains a
modification rendering a messenger RNA transcribed from said nucleotide sequence 
incapable of complete translation. 

The promoter component may be any monocotyledonous plant promoter, that is, any
promoter which is capable of regulating or directing the expression of an operably linked
gene in the targeted monocotyledonous plant. Such promoters are well known in the art.
Preferably, a promoter which is capable of directing strong constitutive expression is used.
Such promoters include, but are not limited to, the maize ubiquitin promoter described in
Toki et al., Plant Physiol. 100: 1503-1507 (1992), the maize phosphoenolpyruvate
carboxylase (PEPC) promoter as described in Hudspeth, R.L. et al., Plant Molec. Biol. 12:
579-589 (1989), and the CaMV 35S promoter as described in Kay et al., Science 236:
1299-1302 (1987). 

The coding sequence component comprises a sequence which, when transcribed, 
produces a translationally altered RNA molecule corresponding to a target viral sequence.
The target viral sequence is a messenger RNA (mRNA) molecule of the target virus, or a
portion thereof. Since the target viral sequence is naturally translatable when a translation
initiation codon is present, it is modified so as to render it translationally altered. For any
given target viral sequence, the skilled artisan will be able to determine various 
modifications which could be made to render the resulting RNA molecule translationally 
altered. 

Translation of an mRNA molecule in a plant cell generally requires the presence of an 
initiation AUG codon followed by an uninterrupted string of amino acid codons (known as 
the reading frame) ending with a translational stop codon, which may be either UAA, UAG
or UGA. A DNA molecule encoding a translatable mRNA molecule may be modified to
encode a translationally altered RNA, for instance, by either removing the initiation ATG
codon, interrupting the reading frame, adding premature stop codons, or by a combination
of these modifications.

Introduction of one or more premature stop codons (encoded by DNA codons TAA, 
TAG or TGA) in a target viral sequence may be accomplished by adding or deleting
nucleotides or by modifying existing nucleotides using standard techniques such as site
directed mutagenesis or mutagenesis by PCR. Adding or deleting nucleotides may have
the additional benefit of interrupting the reading frame, which also has the effect of
translationally altering the RNA molecule. While the addition of a premature stop codon
anywhere along the length of the target viral sequence will render it translationally altered
as that term is used herein to describe the invention, it is preferable to introduce such stop
codons near the 5' end of the target viral mRNA so that any attenuated peptides which may
be produced via partial translation are 20 amino acids or less in length. 
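
The effect of a premature stop codon on the length of the attenuated peptide can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not a description of the constructs of the invention: the sequence TOY_CODING is an invented placeholder rather than a viral sequence, translation is assumed to start at the first ATG in the string, and the function names peptide_length and add_premature_stop are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def peptide_length(dna):
    """Count codons read from the first ATG up to the first in-frame stop."""
    start = dna.find("ATG")
    if start == -1:
        return 0  # no initiation codon, so nothing is translated
    codons = 0
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def add_premature_stop(dna, codon_index, stop="TAA"):
    """Replace the codon at codon_index (0 is the ATG) with a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon")
    start = dna.find("ATG")
    if start == -1:
        raise ValueError("no initiation codon found")
    pos = start + 3 * codon_index
    if codon_index < 1 or pos + 3 > len(dna):
        raise ValueError("codon_index outside the coding region")
    return dna[:pos] + stop + dna[pos + 3:]


# Hypothetical placeholder: ATG, thirty alanine codons, then a natural stop.
TOY_CODING = "ATG" + "GCT" * 30 + "TAA"

if __name__ == "__main__":
    altered = add_premature_stop(TOY_CODING, 10)
    print(peptide_length(TOY_CODING), peptide_length(altered))
```

In this toy case the unmodified sequence encodes 31 residues, while placing the stop at codon index 10 leaves a 10-residue peptide, within the 20 amino acid preference stated above.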

The reading frame of a target viral sequence may be interrupted by the addition or
deletion of nucleotides in the DNA coding sequence. As with the addition of premature stop
codons, it is preferable to interrupt the reading frame near the 5' end of the target viral RNA
so that any attenuated peptides corresponding to a portion of the peptide encoded by the 
target viral RNA which may be produced via partial translation are 20 amino acids or less in 
length. 

Another way to translationally alter the target viral sequence is to remove the
translation initiation codon, which will be an ATG. This may be accomplished simply by 
choosing a target viral sequence which does not include the translation initiation codon. 
Alternatively, this may be accomplished by disrupting the ATG codon either by adding, 
deleting or modifying nucleotides within this codon using standard techniques. 
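
The two routes described in this paragraph, omitting the initiation codon or disrupting it, amount to simple string operations on the DNA coding sequence. The Python sketch below is illustrative only: the toy sequence, the choice of ATC as the replacement triplet, and the function names disrupt_start_codon and omit_start_codon are assumptions made for the example. It does not consider any ATG triplets that may remain elsewhere in the sequence.

```python
def disrupt_start_codon(dna, start_index, replacement="ATC"):
    """Overwrite the ATG at start_index (0-based) with a non-ATG triplet."""
    if dna[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    if len(replacement) != 3 or replacement == "ATG":
        raise ValueError("replacement must be a non-ATG triplet")
    return dna[:start_index] + replacement + dna[start_index + 3:]


def omit_start_codon(dna, start_index):
    """Return the portion of the sequence downstream of the initiation codon."""
    if dna[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    return dna[start_index + 3:]


if __name__ == "__main__":
    toy = "CCATGGCTGCTTAA"  # hypothetical placeholder with ATG at index 2
    print(disrupt_start_codon(toy, 2))
    print(omit_start_codon(toy, 2))
```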

Any messenger RNA molecule produced by the target monocotyledonous virus, or
any portion of such a molecule, may be used as the target viral sequence. The target viral
sequence is preferably at least 120 nucleotides in length, more preferably at least 250 
nucleotides in length, and most preferably at least 500 nucleotides in length. 

A translationally altered viral RNA according to the invention includes any modified
form of a naturally occurring viral messenger RNA sequence which cannot be completely
translated as compared to the unmodified, naturally occurring form. Thus a translationally
altered viral RNA may either be incapable of being translated at all, or it may be capable of
being translated into an attenuated peptide corresponding to a portion of the peptide
encoded by the target viral sequence used as a template.

The inhibitory effect of a translationally altered viral RNA is contemplated to be based,
at least in part, upon its effect on host cell degradation mechanisms. Production of a
translationally altered viral RNA in a plant cell is contemplated to trigger one or more cellular
RNA degradation mechanisms which target the translationally altered viral RNA, as well as
any corresponding homologous unaltered viral RNA molecules which may be present in the
cell (see, e.g. page 550 of Dougherty, W.G. et al., Mol. Plant-Microbe Int. 7(5): 544-552
(1994); Chasan, R., The Plant Cell 6: 1329-1331 (1994)).

The ability to translate an attenuated peptide, particularly a short peptide less than 20
amino acids, is contemplated to enhance the triggering effect of the translationally altered
viral RNA upon host cell RNA degradation pathways contemplated to play a role in inhibition
of viral infection. Thus translationally altered RNAs which are capable of translating an
attenuated peptide are preferred. More preferably, the translationally altered viral RNA is
capable of translating an attenuated peptide less than 20 amino acids in length. For target
viral RNAs which do not include a translation initiation codon, one may be added in
conjunction with the addition of a premature stop codon or interruption of the reading frame
to create a translationally altered RNA capable of translating an attenuated peptide (see, for
example, the construct pCIB5018 described in Example 4).

Target viral sequences may be selected from the group consisting of potyvirus,
luteovirus, tenuivirus, carmovirus, machlovirus, geminivirus and reovirus sequences and
may correspond to the coding sequence for any viral protein, such as a viral coat protein,
replicase, proteinase, inclusion body protein, helicase, 6K protein and VPg. Such
sequences are well known for several monocotyledonous viruses including, but not limited
to, MDMV (see SEQ ID NO. 1), Sugarcane mosaic virus (partial sequence; see Frenkel, M.
J. et al., J. Gen. Virol. 72: 237-242 (1991)), Johnsongrass mosaic virus (partial sequence)
(see Gough, K. H. et al., J. Gen. Virol. 68: 297-304 (1987)), maize chlorotic mottle virus (see
Nutter, R. C. et al., Nucleic Acids Research 17: 3163-3177 (1989)), maize chlorotic dwarf
virus (see WO 94/21796), maize rough dwarf virus (partial sequence) (see Marzachi, C. et
al., Virology 180: 518-526 (1991)), maize stripe virus (partial sequence) (see Huiet, L. et al.,
Virology 182: 47-53 (1991); Huiet, L. et al., J. Gen. Virol. 73: 1603-1607 (1992); Huiet, L. et
al., GenBank Accession Number L3446 (1993)), maize streak virus (see Mullineaux, P. M.
et al., EMBO J. 3: 3063-3068 (1984)), barley yellow dwarf virus (see Larkins, B. A. et al., J.
Gen. Virol. 72: 2347-2355 (1991)), and wheat spindle streak virus (partial sequence) (see
Sohn, A. et al., Arch. Virol. 135: 279-292 (1994)).

Suitable host plants which may benefit from the production of translationally altered
viral RNA such as altered MDMV RNA include any monocotyledonous species which are
susceptible to viral infection, particularly infection by a member of the potyvirus family. In
particular, suitable host plants include maize, wheat, sugarcane and sorghum.

In a preferred embodiment, the target viral sequence used is a coding sequence
which is identical or highly homologous among two or more monocotyledonous viruses or
virus strains. Expression of a translationally altered RNA in a monocotyledonous plant
based on such a shared sequence is contemplated to inhibit infection by any of the viruses 
which produce a messenger RNA having homology with the target viral sequence. 

A second aspect of the present invention is based upon the elucidation and 
characterization by the inventors of the genomic structure and organization of strain B of 
maize dwarf mosaic virus (MDMV-B). Previously, only the genomic sequence of the MDMV- 
B coat protein was known (see Frenkel, M. J. et al., J. Gen. Virol. 72: 237-242 (1991)). As a 
result of the disclosed invention it is now possible to apply many of the more recent 
recombinant-DNA based approaches that have been used for combating plant viruses to
MDMV such as the use of chimeric genes comprising a plant promoter operably linked to a
nucleotide sequence derived from the genomic sequence of maize dwarf mosaic virus strain
B encoding a viral protein other than a coat protein, wherein transgenic expression of said
chimeric genes in a plant inhibits infection of said plant with said virus.

The MDMV-B positive strand RNA genome is believed to be approximately 10,000
bases in length based on the length of other potyviruses. The sequence of 8530
nucleotides beginning at the 3' end of the MDMV-B genome is set forth in SEQ ID NO: 1. A
single long open reading frame was identified within this sequence of the viral genome and
the polyprotein amino acid sequence encoded by this open reading frame is provided in
SEQ ID NO: 2. With the sequence information provided, this viral genome can be isolated
and cloned using a variety of standard genetic engineering techniques well known to those
of skill in the art. Three DNA fragments covering 85% of the MDMV-B genome have been
cloned into a Bluescript II SK plasmid backbone (Stratagene), transformed and propagated
in the E. coli cell line HB101, and deposited on June 29, 1995 with the Midwest Area
National Center for Agricultural Utilization Research (formerly known as the National
Regional Research Lab and still referred to by the corresponding acronym "NRRL"). One of
the plasmids designated "1-47" and deposited under the accession No. NRRL B-21479
contains nucleotides 3252-8530 of the MDMV-B genome. Another plasmid designated
24" and deposited under the accession No. NRRL B-21480 contains nucleotides 1866-3317
of the MDMV-B genome. Yet another plasmid designated "9-1-5" and deposited under the
accession No. NRRL B-21481 contains nucleotides 1-2122 of the MDMV-B genome.
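
The overlap of the three deposited inserts can be checked by merging their nucleotide ranges. The Python sketch below is illustrative only. The ranges are those stated above, keyed by accession number; the genome length of 10,000 bases is the approximate figure given in the text and is treated here as an assumption, and the names CLONES, merged_coverage and covered_length are hypothetical.

```python
# Insert ranges (1-based, inclusive) keyed by NRRL accession number.
CLONES = {
    "NRRL B-21479": (3252, 8530),
    "NRRL B-21480": (1866, 3317),
    "NRRL B-21481": (1, 2122),
}

# Assumption: approximate genome length stated in the description.
APPROX_GENOME_LENGTH = 10_000


def merged_coverage(spans):
    """Merge overlapping or adjacent ranges into contiguous spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def covered_length(spans):
    """Total number of distinct nucleotides covered by the ranges."""
    return sum(end - start + 1 for start, end in merged_coverage(spans))


if __name__ == "__main__":
    print(merged_coverage(CLONES.values()))
    print(covered_length(CLONES.values()) / APPROX_GENOME_LENGTH)
```

The three inserts merge into one contiguous span of 8530 nucleotides, about 85% of an assumed 10,000-base genome, in line with the figure given above.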

The polyprotein encoded by the MDMV-B genome includes a single coat protein 
designated CP whose coding sequence extends from nucleotide 7308 to 8291 of SEQ ID 
No. 1 and whose amino acid sequence extends from amino acid 2436 to 2763 of SEQ ID 
No. 2. The MDMV-B polyprotein is also contemplated to include a replicase protein, three
proteinases, a 6K protein, a helper component, proteins involved in viral movement in the
host plant (both cell to cell and long distance transport), a helicase protein and a VPg
protein. 

MDMV-B is contemplated to contain a serine-like proteinase analogous to serine-like
proteinases that have been identified in related potyviruses. These serine-like proteinases
have a characteristic catalytic domain of three amino acids consisting of a histidine at
position 1 of the domain, an aspartic acid at the second position, and a cysteine at the third
(see Bazan, J. F. and Fletterick, R. J., Proc. Natl. Acad. Sci. USA 85: 7872-7876 (1988)).
These amino acids are separated in the primary amino acid sequence by a region spanning
approximately 140 amino acids. The intervening sequences between each of the catalytic
domain sequences exhibit additional limited homology among the known proteinases (see
Bazan, J. F. and Fletterick, R. J., Proc. Natl. Acad. Sci. USA 85: 7872-7876 (1988)).
Based upon comparison with the known proteinase sequences, the MDMV-B proteinase
catalytic domain is contemplated to span a 105 amino acid sequence from position 1718 to
1823 of SEQ ID No: 2 with the three catalytic residues occurring at amino acids 1718, 1753,
and 1823 of SEQ ID No. 2.

MDMV-B is also contemplated to contain a second proteinase analogous to the
cysteine proteinases that have been identified in related potyviruses. The active-site
residues form a catalytic diad made up of a conserved cysteine and histidine which are
separated by 72 amino acids (see Oh, C. and Carrington, J. C., Virology 173: 692-699
(1989)). This proteinase is located within the carboxy-terminus of the HC-Pro region of the
potyvirus polyprotein. Based upon comparison with the known proteinase sequences of
tobacco etch virus, the MDMV-B HC-Pro proteinase domain is contemplated to span a 74
amino acid region from position 263 to 335 of SEQ ID No: 2 with the two catalytic residues
occurring at amino acids 263 and 336.

The location of the MDMV-B putative helicase domain can be identified based on the
homology with other known viral helicase domains (see Gorbalenya, A. E. et al., Nucleic
Acids Research 17(12): 4713-4730 (1989)). The helicase domain consists of seven distinct
highly conserved segments which correspond to the NTP-binding motif. The primary
consensus site consists of a glycine at position 1 of the motif, glycine at position 3, lysine at
position 4, and either a serine or threonine at position 5 (see Gorbalenya, A. et al., supra).
The conserved helicase domain is located in the MDMV-B genome within a region encoding
the cylindrical inclusion protein (CIP) and is found from amino acids 880 to 1010 of SEQ ID
No: 2. The conserved domain (GxGDS) is located at amino acids 883, 885, 886, and 887
of SEQ ID No: 2.
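
A consensus site of the kind described in prose above (glycine at position 1, any residue at position 2, glycine at position 3, lysine at position 4, serine or threonine at position 5) can be located in a protein sequence with a regular expression. The Python sketch below is illustrative only: the pattern is built from that prose description, the test sequence is an invented placeholder rather than any part of SEQ ID No. 2, and the names NTP_BINDING_SITE and find_motif are hypothetical.

```python
import re

# Pattern derived from the prose consensus: G, any residue, G, K, then S or T.
NTP_BINDING_SITE = re.compile(r"G.GK[ST]")


def find_motif(protein, pattern=NTP_BINDING_SITE):
    """Return (1-based start position, matched residues) for each hit."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein)]


if __name__ == "__main__":
    toy_protein = "MAAGSGKSTLL"  # hypothetical placeholder sequence
    print(find_motif(toy_protein))
```

Positions are reported 1-based so that they can be compared directly with amino acid numbering of the kind used for SEQ ID No. 2.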

The coding sequence for the replicase gene of MDMV-B may also be determined by 
the location of conserved motifs common to viral replicase genes and by identification of 
putative viral proteinase cleavage sites bordering the replicase coding sequence. 
Conserved motifs have been found in other viral replicases. In particular, the conserved
amino acid motif GDD (known as domain C) is the hallmark consensus sequence for all
RNA-dependent replicases (Poch et al., EMBO J. 8: 3867-3874 (1989)). This conserved motif
is found at amino acids 2266-2268 in the MDMV-B open reading frame (SEQ ID No: 2).
Two additional conserved motifs characteristic of a plant viral replicase have been identified
and designated as domain A and B (Poch et al., supra). Domain A is a 17 amino acid
sequence with two centrally conserved amino acids which are present in the MDMV-B
genome at amino acids 2163 and 2168 of SEQ ID No: 2. Domain B is a 10 amino acid
sequence consisting of 5 conserved amino acids which are present in the MDMV-B genome
at amino acids 2222, 2223, 2224, 2225 and 2226 of SEQ ID No: 2.

The isolated MDMV-B genomic sequences taught by the present invention are 
particularly useful for the development of viral resistance in susceptible host plants. With 
the information provided by the present invention, several approaches for inhibiting plant 
virus infection in susceptible plant hosts which involve expressing in such hosts various 
inhibitory transcripts or proteins derived from the target virus genome may now be applied 
to MDMV. 

Use of translationally altered RNA in a method for producing a monocotyledonous 
plant with an inheritable trait of resistance to infection by a maize dwarf mosaic virus 
comprising transforming said plant with a chimeric gene comprising a monocotyledonous 
plant promoter operably linked to a nucleotide sequence derived from the genomic 
sequence of a maize dwarf mosaic virus, wherein said nucleotide sequence contains a 
modification rendering a messenger RNA transcribed from said nucleotide sequence 
incapable of complete translation, may now be applied to MDMV-B, as demonstrated by 
Example 4. 

Another approach which may be used to confer plant virus resistance is to express 
the gene of the target virus in the host plant (e.g. WO 94/18336 to Turner et al. for potato 
leaf roll virus and WO 91/13542 to Zaitlin et al. for tobacco mosaic virus; herein 
incorporated by reference in their entirety). This approach may also be applied to MDMV-B 
using the information provided by the present invention. 

For resistance strategies which depend upon expression of a viral replicase coding 
sequence in a transgenic plant, a cDNA clone encompassing nucleotides 5745 to 7307 of 
SEQ ID No: 1, contemplated to include the active domains of the MDMV-B replicase, can be 
used for plant transformation. More preferably, such strategies may be employed by 
transforming a plant with larger expressible fragments of the MDMV-B genome contemplated 
to encompass the entire replicase protein. In this case, the MDMV-B replicase would be 
cleaved from the encoded polypeptide when exposed to MDMV-B viral proteinase in the 
plant cell. 

The MDMV-B replicase coding sequence may be engineered for recombinant 
expression in a monocotyledonous host plant which is normally susceptible to infection by 
MDMV-B. Expression of MDMV-B replicase in such a monocotyledonous host plant is 
contemplated to confer resistance to (i.e. inhibit) MDMV-B infection. 

Suitable host plants which may benefit from application of any of the resistance 
approaches described above include any monocotyledonous species which are susceptible 
to infection by MDMV-B. In particular, suitable host plants are contemplated to include 
maize, sorghum and sugarcane. 

To express inhibitory transcripts or proteins derived from the MDMV-B genome in a 
host plant cell, the corresponding coding sequence is operably linked to regulatory 
sequences which cause its expression in the chosen host plant cell. Examples of 
promoters capable of functioning in plants or plant cells, i.e., those capable of driving 
expression of the associated coding sequences such as MDMV-B CP in plant cells, include 
the cauliflower mosaic virus (CaMV) 19S or 35S promoters and CaMV double promoters; 
nopaline synthase promoters; pathogenesis-related (PR) protein promoters; small subunit of 
ribulose bisphosphate carboxylase (ssuRUBISCO) promoters; plant ubiquitin gene 
promoters; plant actin gene promoters; plant pith-preferred promoters, and the like. 
Preferred are the rice actin promoter (McElroy et al., Mol. Gen. Genet. 231: 150 (1991)), 
maize ubiquitin promoter (EP-A-342 926; Taylor et al., Plant Cell Rep. 12: 491 (1993); Toki 
et al., Plant Phys. 100: 1503-1507 (1992)), a maize pith-preferred promoter (WO 93/07278 
incorporated by reference herein in its entirety; in particular see Figure 24 and pages 
27-28), and the PR-1 promoter from tobacco, Arabidopsis, or maize (see EP-A-332 104). Also 
preferred are the 35S promoter and an enhanced or double 35S promoter such as that 
described in Kay et al., Science 236: 1299-1302 (1987) and the double 35S promoter 
cloned into pCGN2113, deposited as ATCC 40587. The promoters themselves may be 
modified to manipulate promoter strength to increase expression of MDMV-B coding 
sequences in accordance with art-recognized procedures. 

The chimeric DNA construct(s) of the invention may contain multiple copies of a 
promoter or multiple copies of a particular coding sequence. In addition, the construct(s) 
may include coding sequences for markers and coding sequences for other peptides such 
as signal or transit peptides, each in proper reading frame with the other functional 
elements in the DNA molecule. The preparation of such constructs is within the ordinary 
level of skill in the art. 

Since the MDMV-B proteins are naturally expressed as part of a polyprotein, each 
protein does not include its own translation initiation and translation stop codon. To express 
such proteins individually in the context of a chimeric gene, a translation initiation codon will 
need to be added immediately adjacent to the first codon if one does not occur as part of 
the coding sequence. The skilled artisan will recognize that addition of such a translation 
initiation codon will add a methionine amino acid to the amino-terminal end of the encoded 
protein. Such an addition is not contemplated to have any significant effect upon the 
properties of the protein. Also, a translation stop codon will need to be added to the 
chimeric gene immediately after the last codon of the protein if one does not naturally occur 
at this location. 
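
The addition of initiation and stop codons described above can be illustrated with a short Python sketch. It assumes the input is an in-frame DNA coding segment excised from the polyprotein reading frame; the function name and the default choice of TAA as the stop codon are illustrative assumptions, not part of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def make_expressible(coding_segment: str, stop_codon: str = "TAA") -> str:
    """Add an ATG start codon and a stop codon to an in-frame segment if absent."""
    seq = coding_segment.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding segment length must be a multiple of three")
    if stop_codon not in STOP_CODONS:
        raise ValueError("stop_codon must be TAA, TAG or TGA")
    # Add an initiation codon only when the segment does not already begin with one.
    if not seq.startswith("ATG"):
        seq = "ATG" + seq
    # Add a stop codon only when the segment does not already end with one.
    if seq[-3:] not in STOP_CODONS:
        seq = seq + stop_codon
    return seq


# Hypothetical two-codon segment used only for illustration.
print(make_expressible("GGTGCT"))  # ATGGGTGCTTAA
```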

Useful markers include peptides providing herbicide, antibiotic or drug resistance, 
such as, for example, resistance to hygromycin, kanamycin, G418, gentamycin, lincomycin, 
methotrexate, glyphosate, phosphinothricin, or the like. These markers can be used to 
select cells transformed with the chimeric DNA constructs of the invention from 
untransformed cells. Other useful markers are peptidic enzymes which can be easily 
detected by a visible reaction, for example a color reaction, such as luciferase, 
β-glucuronidase, or β-galactosidase. 

Standard recombinant DNA and molecular cloning techniques used in the following 
examples are well known in the art and are described by J. Sambrook, E. F. Fritsch and T. 
Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold 
Spring Harbor, NY (1989) and by T.J. Silhavy, M.L. Berman, and L.W. Enquist, Experiments 
with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984). 

Example 1 : Construction of Plant Transformation Vectors 

Numerous transformation vectors are available for plant transformation, and the 
genes of this invention can be used in conjunction with any such vectors. The selection of 
vector for use will depend upon the preferred transformation technique and the target 
species for transformation. For certain target species, different antibiotic or herbicide 
selection markers may be preferred. Selection markers used routinely in transformation 
include the nptII gene which confers resistance to kanamycin and related antibiotics 
(Messing & Vieira, Gene 19: 259-268 (1982); Bevan et al., Nature 304: 184-187 (1983)), the 
bar gene which confers resistance to the herbicide phosphinothricin (White et al., Nucl. 
Acids Res. 18: 1062 (1990); Spencer et al., Theor. Appl. Genet. 79: 625-631 (1990)), the 
hph gene which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann, 
Mol. Cell Biol. 4: 2929-2931 (1984)), and the dhfr gene, which confers resistance to 
methotrexate (Bourouis et al., EMBO J. 2(7): 1099-1104 (1983)). 

1. Construction of Vectors Suitable for Agrobacterium Transformation 

Many vectors are available for transformation using Agrobacterium tumefaciens. 
These typically carry at least one T-DNA border sequence and include vectors such as 
pBIN19 (Bevan, Nucl. Acids Res. (1984)). Below the construction of two typical vectors is 
described. 

1.1. Construction of pCIB200 and pCIB2001 

The binary vectors pCIB200 and pCIB2001 are used for the construction of 
recombinant vectors for use with Agrobacterium and were constructed in the following 
manner. pTJS75kan was created by NarI digestion of pTJS75 (Schmidhauser & Helinski, J. 
Bacteriol. 164: 446-455 (1985)) allowing excision of the tetracycline-resistance gene, 
followed by insertion of an AccI fragment from pUC4K carrying an NPTII gene (Messing & 
Vieira, Gene 19: 259-268 (1982); Bevan et al., Nature 304: 184-187 (1983); McBride et al., 
Plant Molecular Biology 14: 266-276 (1990)). XhoI linkers were ligated to the EcoRV 
fragment of pCIB7 which contains the left and right T-DNA borders, a plant selectable 
nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., Gene 53: 153-161 (1987)), 
and the XhoI-digested fragment was cloned into SalI-digested pTJS75kan to create 
pCIB200 (see also EP-A-332 104, example 19). pCIB200 contains the following unique 
polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative 
of pCIB200 which was created by the insertion into the polylinker of additional restriction 
sites. Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, 
XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these 
unique restriction sites, also has plant and bacterial kanamycin selection, left and right 
T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for 
mobilization between E. coli and other hosts, and the OriT and OriV functions also from 
RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes 
containing their own regulatory signals. 
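
Whether a restriction site is unique within a given sequence can be checked by counting occurrences of its recognition sequence. The sketch below, in Python, does this for the six pCIB200 polylinker enzymes using their standard recognition sequences. The polylinker string is a hypothetical stand-in (the actual pCIB200 sequence is not given here), and because all six sites are palindromic only one strand is scanned.

```python
# Standard recognition sequences for the six pCIB200 polylinker enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "SstI": "GAGCTC",
    "KpnI": "GGTACC",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
}


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences of a recognition sequence, including overlapping ones."""
    sequence = sequence.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def unique_sites(sequence: str) -> list[str]:
    """Return the enzymes whose recognition sequence occurs exactly once."""
    return [name for name, site in RECOGNITION_SITES.items()
            if count_sites(sequence, site) == 1]


# Hypothetical polylinker assembled from the six sites, for illustration only.
toy_polylinker = "GAATTCGAGCTCGGTACCAGATCTTCTAGAGTCGAC"
print(unique_sites(toy_polylinker))
```

In practice the whole vector sequence, not only the polylinker, would be passed to `unique_sites`, since a site is useful for cloning only if it is absent from the rest of the plasmid.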

1.2. Construction of pCIB10 and Hygromycin Selection Derivatives thereof 

The binary vector pCIB10 contains a gene encoding kanamycin resistance for 
selection in plants, T-DNA right and left border sequences and incorporates sequences 
from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and 
Agrobacterium. Its construction is described by Rothstein et al., Gene 53: 153-161 (1987). 
Various derivatives of pCIB10 have been constructed which incorporate the gene for 
hygromycin B phosphotransferase described by Gritz et al., Gene 25: 179-188 (1983). 
These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), 
or hygromycin and kanamycin (pCIB715, pCIB717). 

2. Construction of Vectors Suitable for non-Agrobacterium Transformation 

Transformation without the use of Agrobacterium tumefaciens circumvents the 
requirement for T-DNA sequences in the chosen transformation vector and consequently 
vectors lacking these sequences can be utilized in addition to vectors such as the ones 
described above which contain T-DNA sequences. Transformation techniques which do not 
rely on Agrobacterium include transformation via particle bombardment, protoplast uptake 
(e.g. PEG and electroporation) and microinjection. The choice of vector depends largely on 
the preferred selection for the species being transformed. Below, the construction of some 
typical vectors is described. 

2.1 Construction of pCIB3064 

pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in 
combination with selection by the herbicide basta (or phosphinothricin). The plasmid 
pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene 
and the CaMV 35S transcriptional terminator and is described in WO 93/07278. The 35S 
promoter of this vector contains two ATG sequences 5' of the start site. These sites were 
mutated using standard PCR techniques in such a way as to remove the ATGs and 
generate the restriction sites SspI and PvuII. The new restriction sites were 96 and 37 bp 
away from the unique SalI site and 101 and 42 bp away from the actual start site. The 
resultant derivative of pCIB246 was designated pCIB3025. The GUS gene was then 
excised from pCIB3025 by digestion with SalI and SacI, the termini rendered blunt and 
religated to generate plasmid pCIB3060. The plasmid pJIT82 was obtained from the John 
Innes Centre, Norwich and a 400 bp SmaI fragment containing the bar gene from 
Streptomyces viridochromogenes was excised and inserted into the HpaI site of pCIB3060 
(Thompson et al., EMBO J. 6: 2519-2523 (1987)). This generated pCIB3064 which 
comprises the bar gene under the control of the CaMV 35S promoter and terminator for 
herbicide selection, a gene for ampicillin resistance (for selection in E. coli) and a polylinker 
with the unique sites SphI, PstI, HindIII, and BamHI. This vector is suitable for the cloning 
of plant expression cassettes containing their own regulatory signals. 

2.2 Construction of pSOG19 and pSOG35 

pSOG35 is a transformation vector which utilizes the E. coli gene dihydrofolate 
reductase (DHFR) as a selectable marker conferring resistance to methotrexate. PCR was 
used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp) 
and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250 bp fragment 
encoding the E. coli dihydrofolate reductase type II gene was also amplified by PCR and 
these two PCR fragments were assembled with a SacI-PstI fragment from pBI221 
(Clontech) which comprised the pUC19 vector backbone and the nopaline synthase 
terminator. Assembly of these fragments generated pSOG19 which contains the 35S 
promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene and the 
nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader 
sequence from Maize Chlorotic Mottle Virus (MCMV) generated the vector pSOG35. 
pSOG19 and pSOG35 carry the pUC gene for ampicillin resistance and have HindIII, SphI, 
PstI and EcoRI sites available for the cloning of foreign sequences. 

Example 2: Construction of Plant Expression Cassettes 

Gene sequences intended for expression in transgenic plants are firstly assembled in 
expression cassettes behind a suitable promoter and upstream of a suitable transcription 
terminator to create a chimeric gene. These expression cassettes can then be easily 
transferred to the plant transformation vectors described above in Example 1. 

Promoter Selection 

The selection of a promoter used in expression cassettes or chimeric genes will 
determine the spatial and temporal expression pattern of the transgene in the transgenic 
plant. Selected promoters will express transgenes in specific cell types (such as leaf 
epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, 
leaves or flowers, for example) and this selection will reflect the desired location of 
expression of the transgene. Alternatively, the selected promoter may drive expression of 
the gene in a light-induced or other temporally regulated manner. A further alternative 
is that the selected promoter be chemically regulated. This would provide the possibility of 
inducing expression of the transgene only when desired and caused by treatment with a 
chemical inducer. 

Transcriptional Terminators 

A variety of transcriptional terminators are available for use in expression cassettes. 
These are responsible for the termination of transcription beyond the transgene and its 
correct polyadenylation. Appropriate transcriptional terminators are those which are known 
to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline 
synthase terminator, and the pea rbcS E9 terminator. These can be used in both 
monocotyledons and dicotyledons. 

Sequences for the Enhancement or Regulation of Expression 

Numerous sequences have been found to enhance gene expression from within the 
transcriptional unit and these sequences can be used in conjunction with the genes of this 
invention to increase their expression in transgenic plants. 

Various intron sequences have been shown to enhance expression, particularly in 
monocotyledonous cells. For example, the introns of the maize Adh1 gene have been 
found to significantly enhance the expression of the wild-type gene under its cognate 
promoter when introduced into maize cells. Intron 1 was found to be particularly effective 
and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase 
gene (Callis et al., Genes Develop. 1: 1183-1200 (1987)). In the same experimental 
system, the intron from the maize bronze1 gene had a similar effect in enhancing 
expression (Callis et al., supra). Intron sequences have been routinely incorporated into 
plant transformation vectors, typically within the non-translated leader. 

A number of non-translated leader sequences derived from viruses are also known to 
enhance expression. Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the 
"Ω-sequence"), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AlMV) have 
been shown to be effective in enhancing expression (e.g. Gallie et al., Nucl. Acids Res. 15: 
8693-8711 (1987); Skuzeski et al., Plant Molec. Biol. 15: 65-79 (1990)). 
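
The elements discussed in this example (promoter, optional intron, optional non-translated leader, coding sequence, terminator) are arranged in a fixed 5'-to-3' order within a cassette. The following Python sketch models that arrangement as a simple record; the class and field names are hypothetical, and the example instance mirrors the composition of pSOG19 as described in Example 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpressionCassette:
    """Illustrative record of the functional elements of a chimeric gene."""
    promoter: str
    coding_sequence: str
    terminator: str
    intron: Optional[str] = None   # optional expression-enhancing intron
    leader: Optional[str] = None   # optional non-translated leader

    def layout(self) -> list[str]:
        """Return the elements in 5'-to-3' order, skipping absent optional parts."""
        parts = [self.promoter, self.intron, self.leader,
                 self.coding_sequence, self.terminator]
        return [part for part in parts if part is not None]


# Composition of pSOG19 as described in Example 1.
psog19 = ExpressionCassette(
    promoter="CaMV 35S promoter",
    intron="maize Adh1 intron 6",
    leader="GUS leader",
    coding_sequence="DHFR",
    terminator="nopaline synthase terminator",
)
print(" > ".join(psog19.layout()))
```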

Example 3: Transformation of Monocotyledons 

Transformation of monocotyledon species such as wheat or maize has become 
routine. Preferred techniques include direct gene transfer into protoplasts using PEG or 
electroporation techniques, and particle bombardment into callus tissue. Transformations 
can be undertaken with a single DNA species or multiple DNA species (i.e. 
co-transformation) and both these techniques are suitable for use with this invention. 
Co-transformation may have the advantage of avoiding complex vector construction and of 
generating transgenic plants with unlinked loci for the gene of interest and the selectable 
marker, enabling the removal of the selectable marker in subsequent generations, should 
this be regarded desirable. However, a disadvantage of the use of co-transformation is the 
less than 100% frequency with which separate DNA species are integrated into the genome 
(Schocher et al., Biotechnology 4: 1093-1096 (1986)). 

EP-A-292 435 (to Ciba-Geigy), EP-A-392 225 (to Ciba-Geigy) and WO 93/07278 (to 
Ciba-Geigy) describe techniques for the preparation of callus and protoplasts from an elite 
inbred line of maize, transformation of protoplasts using PEG or electroporation, and the 
regeneration of maize plants from transformed protoplasts. Gordon-Kamm et al. (Plant Cell 
2: 603-618 (1990)) and Fromm et al. (Biotechnology 8: 833-839 (1990)) have published 
techniques for transformation of an A188-derived maize line using particle bombardment. 
Furthermore, WO 93/07278 (to Ciba-Geigy) and Koziel et al. (Biotechnology 11: 194-200 
(1993)) describe techniques for the transformation of elite inbred lines of maize by particle 
bombardment. This technique utilizes immature maize embryos of 1.5-2.5 mm length 
excised from a maize ear 14-15 days after pollination and a PDS-1000He Biolistics device 
for bombardment. 

Transformation of rice can also be undertaken by direct gene transfer techniques 
utilizing protoplasts or particle bombardment. Protoplast-mediated transformation has been 
described for Japonica-types and Indica-types (Zhang et al., Plant Cell Rep. 7: 379-384 
(1988); Shimamoto et al., Nature 338: 274-277 (1989); Datta et al., Biotechnology 8: 736-740 
(1990)). Both types are also routinely transformable using particle bombardment (Christou 
et al., Biotechnology 9: 957-962 (1991)). 

EP-A-332 581 (to Ciba-Geigy) describes techniques for the generation, transformation 
and regeneration of Pooideae protoplasts. These techniques allow the transformation of 
Dactylis and wheat. Furthermore, wheat transformation has been described by Vasil et al. 
(Biotechnology 10: 667-674 (1992)) using particle bombardment into cells of type C 
long-term regenerable callus, and also by Vasil et al. (Biotechnology 11: 1553-1558 (1993)) and 
Weeks et al. (Plant Physiol. 102: 1077-1084 (1993)) using particle bombardment of 
immature embryos and immature embryo-derived callus. A preferred technique for wheat 
transformation, however, involves the transformation of wheat by particle bombardment of 
immature embryos and includes either a high sucrose or a high maltose step prior to gene 
delivery. Prior to bombardment any number of embryos (0.75-1 mm in length) are plated 
onto MS medium with 3% sucrose (see Murashige & Skoog, Physiologia Plantarum 15: 
473-497 (1962)) and 3 mg/l 2,4-D for induction of somatic embryos which is allowed to proceed 
in the dark. On the chosen day of bombardment, embryos are removed from the induction 
medium and placed onto the osmoticum (i.e. induction medium with sucrose or maltose 
added at the desired concentration, typically 15%). The embryos are allowed to plasmolyze 
for 2-3 h and are then bombarded. Twenty embryos per target plate is typical, although not 
critical. An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated 
onto micrometer size gold particles using standard procedures. Each plate of embryos is 
shot with the DuPont Biolistics helium device using a burst pressure of ~1000 psi using a 
standard 80 mesh screen. After bombardment, the embryos are placed back into the dark 
to recover for about 24 h (still on osmoticum). After 24 hours, the embryos are removed 
from the osmoticum and placed back onto induction medium where they stay for about a 
month before regeneration. Approximately one month later the embryo explants with 
developing embryonic callus are transferred to regeneration medium (MS + 1 mg/liter NAA, 
5 mg/liter GA), further containing the appropriate selection agent (10 mg/l basta in the case 
of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35). After approximately one 
month, developed shoots are transferred to larger sterile containers known as "GA7s" which 
contain half-strength MS, 2% sucrose, and the same concentration of selection agent. 

Example 4: MDMV-B Resistance Conferred by Expression of Translationally Altered Viral Transcripts 

Our research has focused on cloning and sequencing the remainder of the MDMV-B 
genome. We have disclosed the majority of the MDMV-B sequence in this application. We 
have identified coding regions within the MDMV-B coding region based on conserved motifs 
previously identified in other potyviruses. The regions of the virus selected for use as 
transgenes have been the MDMV-B non-structural proteins (i.e. Replicase, Proteinase, and 
Helicase). These regions were targeted based on the expected higher degree of sequence 
conservation within these genes among strains of MDMV. We predict that the use of these 
regions will give the highest probability of obtaining resistance to multiple strains of MDMV 
when transformed into elite maize inbreds. The sequences have been used to transform 
maize plants for the purpose of conferring virus resistance. 

Maize dwarf mosaic virus strain B (MDMV-B) was obtained from Dr. S. Jensen 
(University of Nebraska-Lincoln) and maintained in a susceptible maize inbred by serial 
inoculation. Virus was prepared for inoculation as previously described (see Law, M. D. et 
al., Phytopathology 79: 757-761 (1988)). 

The virus was purified from two week old infected maize tissue by the following 
protocol. The harvested tissue was homogenized with 0.2 M sodium acetate, pH 5.0 
containing 0.1% β-mercaptoethanol (1:6 ratio W:V) in a blender. The homogenate was 
filtered through cheesecloth and then centrifuged for 15 minutes at 6000 RPM (Sorvall GSA 
rotor). The recovered supernatant was then filtered through glass wool and adjusted to a 
concentration of 0.5% Triton X-100 and 0.2M NaCl. The virus was precipitated from the 
solution by adding PEG 8000 (8% final concentration) and then stirring for 2 hours at 4°C. 
The virus was recovered by centrifugation for 15 minutes at 8,000 RPM (Sorvall GSA rotor). 

The resulting pellet containing the virus was resuspended by stirring in 0.1M Tris pH 
6.5 containing 0.032 M sodium citrate. The virus solution was clarified by centrifugation 
through a 20% sucrose pad for 2 hours at 28,000 RPM (SW28 rotor). The recovered pellet 
was resuspended in 10 ml of 0.1 M Tris pH 6.5 containing 0.032 M sodium citrate. The 
supernatant was adjusted to a concentration of 34% cesium sulfate and centrifuged for 14 
hours at 48,000 RPM (Ti 70.1 rotor). The opalescent band containing the virus was 
removed and dialyzed against 0.1 M Tris pH 6.5 containing 0.032 M sodium citrate. Viral 
RNA was isolated from the purified virions by phenol extraction and ethanol precipitation. 

The isolated RNA was then used as template for cDNA preparation using oligo dT 
primers. The preparation of cDNA clones was performed by standard procedures as 
described (see Sambrook, J. et al., "Molecular Cloning: A Laboratory Manual", Cold Spring 
Harbor Laboratory Press, (1989)). 

Constructs were prepared to specific regions of the MDMV-B genome by PCR 
amplification from cDNA clones. The region amplified by PCR was typically 1200 to 1400 
nucleotides in length and was confirmed by sequencing. Constructs were prepared to the 
regions of the MDMV-B genome which encode the viral replicase (NIb), proteinase (NIa), 
and cylindrical inclusion protein (CIP). These regions were selected based on the higher 
sequence conservation within these regions between members of the potyvirus family. The 
constructs corresponding to a specific viral coding region were altered during PCR 
amplification by nucleotide substitutions within the primers. A methionine translation 
initiation codon was generated immediately preceding the first native codon and a 
termination codon was created at the seventh codon in all constructs tested. This would 
create an mRNA only capable of translating small peptides. The constructs were then ligated 
into either the pUBA plasmid (see Toki et al., Plant Physiol. 100: 1503-1507 (1992)) or the 
pCIB4421 plasmid. The pUBA plasmid contained the ubiquitin promoter and the NOS 
terminator while pCIB4421 contained the maize phosphoenolpyruvate carboxylase (PEPC) 
promoter and the 35S terminator. The plasmid constructs were then verified by DNA 
sequencing. 

The constructs used in this example to transform maize plants have been designated 
pCIB5018 and pCIB5019. pCIB5018 was constructed by ligating the PCR amplified NIa 
fragment (nucleotides 4452-5744 of SEQ ID No. 1) into pCIB4421. The NIa fragment used 
for ligation had previously been altered by insertion of an ATG codon immediately before 
the first nucleotide of the first codon (i.e. the G at position 4452 of SEQ ID No. 1) and 
substitution of a thymidine (T) for the adenine (A) at nucleotide 4470 of SEQ ID No. 1 to 
create a premature stop codon. pCIB5019 was constructed by ligating the altered NIa 
fragment described above into the pUBA plasmid. 
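
The coordinates given above can be checked arithmetically: nucleotide 4470 lies 18 nucleotides downstream of nucleotide 4452, so it is the first position of the seventh native codon, consistent with the termination codon being created at the seventh codon. The Python sketch below works through this and applies the two alterations (ATG insertion and A-to-T substitution) to a toy fragment. The toy sequence is hypothetical and is not taken from SEQ ID No. 1; it was chosen only so that it starts with G and has an AAG codon at the seventh position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

FIRST_NT = 4452    # first nucleotide of the first native NIa codon (from the text)
MUTATED_NT = 4470  # nucleotide changed from A to T (from the text)


def codon_index(nt_pos: int, first_nt: int = FIRST_NT) -> tuple[int, int]:
    """Return (1-based native codon number, 1-based position within that codon)."""
    offset = nt_pos - first_nt
    return offset // 3 + 1, offset % 3 + 1


def alter_fragment(fragment: str, first_nt: int = FIRST_NT,
                   mutated_nt: int = MUTATED_NT) -> str:
    """Prepend ATG and substitute T for A at the position matching mutated_nt."""
    fragment = fragment.upper()
    i = mutated_nt - first_nt
    if fragment[i] != "A":
        raise ValueError("expected an adenine at the mutated position")
    return "ATG" + fragment[:i] + "T" + fragment[i + 1:]


def codons_before_stop(seq: str) -> list[str]:
    """Read codons from the first nucleotide until the first in-frame stop codon."""
    codons = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Hypothetical fragment: six sense codons, then AAG, then two more codons.
TOY_FRAGMENT = "GGTGCTTCTCTGGTTCCAAAGGACTTT"

altered = alter_fragment(TOY_FRAGMENT)
print(codon_index(MUTATED_NT))            # (7, 1)
print(altered)                            # ATGGGTGCTTCTCTGGTTCCATAGGACTTT
print(len(codons_before_stop(altered)))   # 7
```

With the toy fragment, translation would stop after the added methionine and six native residues, which illustrates why the altered transcript can encode only a small peptide.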

Microprojectile Bombardment Protocols 

Plasmid DNA was precipitated onto 1µm gold microcarrier particles as described in 
the DuPont Biolistic manual. 5µg of plasmid DNA containing a synthetic phosphinothricin 
acetyltransferase selectable marker gene and 5µg of either pCIB5018 or pCIB5019 were 
added per 50µl of prepared microcarrier. The synthetic phosphinothricin acetyltransferase 
selectable marker gene provides resistance to the same selection agents as the BAR gene 
(see Kramer, C. et al., Planta 190: 454-458 (1993)). Bombardment of tissue was carried out 
with the DuPont PDS-1000He Biolistic device. An additional 150x150 mesh/linear inch 
screen was inserted 2cm below the stopping screen. Immature embryos were bombarded 
with 1550psi rupture discs on a plate angled 6-8cm below the stopping screen to maximize 
scutellum exposure to particles. Type I callus was placed 4cm below the stopping screen 
and 900psi rupture discs were used in bombardment. All plates for both explant types were 
bombarded twice. 

Immature Embryo Explant Source Initiation and Selection 

Immature embryos of a proprietary Ciba elite line (CG00526) were used as the initial 
explant source in microprojectile-mediated transformation. Embryos were excised from the 
ears 10-14 days post-pollination, when 1-2mm in length. After surface sterilization in a 10% 
Clorox solution, embryos were plated embryonic axis down on the surface of the 
agar-solidified medium. Embryos were plated onto Duncan's "D" callus induction medium plus 
5mg/l chloramben, 2% sucrose, 12mM proline and either the organic amendments specified 
in Duncan's (2DG4) or a modified version (2DA1) which omits the casein hydrolysate and 
adds the amino acids minus glutamine and asparagine from Kao and Michayluk's "KM" 
medium (see Kao and Michayluk, Planta 126: 105-110 (1975)). The plated embryos were 
kept in a 25°C dark culture room continuously until the regeneration phase was initiated. 
The day after plating the embryos were transferred to the appropriate G4 or A1 media 
containing 12% sucrose at least four hours prior to microprojectile bombardment. Thirty-six 
embryos were arranged in a 2-3 cm circle in the center of the plate. The embryos remained 
on the 12% sucrose plate overnight after bombardment. The following day, embryos were 
transferred either to 2DG4 + 5 chloramben + the equivalent of a 10mg/l concentration of 
Basta herbicide (glufosinate ammonium) or 2DA1 + 5 chloramben + 5mg/l Basta. 

Fourteen days from the initial excision and plating, developing compact, organized 
type I callus was excised from the original explant and subcultured to either 2DG4 + 0.5mg/l 
2,4-D + 10mg/l Basta or 2DA1 + 0.5mg/l 2,4-D + 5mg/l Basta. Viable, healthy callus was 
serially subcultured every fourteen days during the selection phase. All tissue was then 
transferred to Duncan's medium, modified by omitting all amino acids, plus 2% sucrose, 
0.5mg/l 2,4-D and 10mg/l Basta (2DG8) at the end of eight weeks. After a two week 
passage on the G8 medium, all living tissue was transferred to regeneration medium. 

Type I Explant Source Initiation and Selection 

Immature embryos of the Ciba elite line (CG00526) were plated embryonic axis down 
onto 2DG4 + 5 chloramben at the 1-2mm length size. The developing compact, highly 
organogenic (type I) callus was excised from the original embryo explant after fourteen days 
and maintained serially on 2DG4 + 0.5mg/l 2,4-D by subculturing to fresh medium every 
ten to fourteen days. When the callus lines obtained were two to three months old, they were 
prepared for microprojectile bombardment. The tissue was subcultured to fresh medium in 
small pieces approximately 1-3mm in size one to two days prior to bombardment. On the 
day of bombardment, the tissue was arranged in a 2-3cm circle in the center of a DA1 plate 
containing 12% sucrose and 0.5mg/l 2,4-D at least four hours prior to bombardment. The 
callus was kept on the plate after bombardment overnight, and transferred the next day to 
2DA1 + 0.5mg/l 2,4-D + 10mg/l Basta. Viable, healthy callus was serially subcultured on the 
same medium every fourteen days during the selection phase. All tissue was transferred to 
Duncan's medium, modified by omitting all amino acids, plus 2% sucrose, 0.5mg/l 2,4-D and 
10mg/l Basta (2DG8) at the end of eight weeks. After a two week passage on the G8 
medium, all living tissue was transferred to regeneration medium. 

Regeneration and Plantlet Establishment of Immature Embryo and Type I Explant Source
Experiments

Tissue for regeneration was moved to a 25°C light culture room under a 16 hour
photoperiod. Regeneration medium consisted of Murashige and Skoog's (MS) salts and
vitamins, 3% sucrose + 0.25mg/l ancymidol, 1.0mg/l NAA, 0.5mg/l kinetin and 5mg/l Basta.
After a two week passage on the regeneration medium with growth regulators, the tissue
was transferred to MS medium + 3mg/l Basta and no additional growth regulators. Plantlets
reaching 1-3cm length were transferred from plates to Magenta GA-7 boxes containing MS
medium (0.75X concentration + 1% sucrose) and no Basta for root development. Plantlets
with sufficient root development were transplanted to soil and moved to the greenhouse.
Plantlets were hardened off in a 70% humidity phytotron for one to two weeks before
moving the plants to the greenhouse range. The greenhouse conditions were as follows:
55% humidity, 400 Einsteins light intensity, 16 hour photoperiod, 80-84°F day temperature,
64-68°F night temperature. Plants were allowed to grow to maturity in the greenhouse and
were either selfed or backcrossed to the parental line in the T1 generation.

Analysis of T0 Plants

T0 plantlets were first assayed by polymerase chain reaction (PCR) to detect the
selectable marker, the gene of interest and an alcohol dehydrogenase (Adh) gene
sequence as an internal assay control. Plantlets were assayed at approximately eight to
fourteen cm height, when the plantlets were still in the GA-7 boxes. Standard PCR
conditions were used (see Kramer, C. et al., Planta 190: 454-458 (1993)). The Adh internal
control primer pair sequence was TGCATGTGGGTTGTGTTGCA (SEQ ID No. 3) and
CTCAGCAAGTAGCTAGACCA (SEQ ID No. 4). The primer pair sequence for the synthetic
PAT gene was TGTGTCCGGAGAGGAGACC (SEQ ID No. 5) and
CCAACATCATGCCATGCACC (SEQ ID No. 6). The primer pair sequence for the NIa
proteinase gene was GCGGGATCCATGGGGAAGAAGAAACGCAGTTGA (5') (SEQ ID No. 7)
and GCGGAGCTCTTAGTCTTCAACGCTCGGGTC (3') (SEQ ID No. 8). The parameters for
PCR amplification for all primer pairs were 45 sec at 94°C, 30 sec at 62°C, 30 sec at 72°C
plus a 3 sec/cycle extension elongation for 40 cycles.
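
Because all three primer pairs share one annealing temperature, it can be useful to compare their lengths and GC content side by side. The following is only an illustrative sketch: the dictionary keys and the helper name `gc_fraction` are hypothetical labels chosen here, and the calculation is a plain base count rather than any melting-temperature model used in the original work.

```python
# Primer sequences as listed in the text; the dictionary keys are
# hypothetical labels introduced for this sketch.
PRIMERS = {
    "Adh_forward": "TGCATGTGGGTTGTGTTGCA",
    "Adh_reverse": "CTCAGCAAGTAGCTAGACCA",
    "PAT_forward": "TGTGTCCGGAGAGGAGACC",
    "PAT_reverse": "CCAACATCATGCCATGCACC",
    "NIa_5prime": "GCGGGATCCATGGGGAAGAAGAAACGCAGTTGA",
    "NIa_3prime": "GCGGAGCTCTTAGTCTTCAACGCTCGGGTC",
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
```

The NIa primers are longer than the others because each carries extra bases at its 5' end in addition to the gene-specific portion.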

Plantlets identified by PCR to be transformed were analyzed by Northern blot assay
for mRNA transcript of the gene of interest (NIa proteinase). Plants were assayed for
mRNA expression either while in the GA-7 containers or when the plants had been
acclimated in the greenhouse. The probe was a 1303 bp fragment of the NIa gene excised
by a BamHI/SacI restriction digest of the pCIB5019 plasmid. Labeling was carried out with
the Gibco/BRL RadPrime DNA Labeling kit as described by the manufacturer. Northern blot
protocols were performed as described (see Sambrook, J. et al., "Molecular Cloning: A
Laboratory Manual", Cold Spring Harbor Laboratory Press (1989)).
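
The NIa primer pair given for the PCR assay contains the recognition sequences of the two enzymes used to excise the probe, which can be checked with a simple string search. This is a minimal sketch under stated assumptions: the function name `find_sites` and the dictionary keys are hypothetical, and the only outside facts used are the standard recognition sequences GGATCC (BamHI) and GAGCTC (SacI).

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "SacI": "GAGCTC"}

# NIa primers as listed in the text; the keys are hypothetical labels.
NIA_PRIMERS = {
    "NIa_5prime": "GCGGGATCCATGGGGAAGAAGAAACGCAGTTGA",
    "NIa_3prime": "GCGGAGCTCTTAGTCTTCAACGCTCGGGTC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in NIA_PRIMERS.items():
        print(name, find_sites(seq))
```

In both primers the site begins after three leading bases, so the search reports position 3 (0-based) for BamHI in the 5' primer and for SacI in the 3' primer.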
Analysis of T1 Plants

Seed harvested from the T0 plants was first dried down in the drying room for one
to two weeks before planting. Seed were planted directly in flats and watered in. The flats
were bottom watered with either a 0.15% volume/volume Basta solution or with water two
days after planting. Four different transformation events were tested for herbicide and
disease resistance in this example, as well as the wild-type elite control. Forty seeds from
each individual transformed plant were tested initially, 20 in Basta and 20 in the water
control. Seven days after the first Basta drench, a second drench was carried out in the
same manner.

All plants were inoculated with MDMV-B following the second Basta soil drench when
the plants were 4-5 inches in height (3-5 leaf stage). A second virus inoculation was
performed on all plantlets 4-6 days after the first inoculation to ensure infection. Plants were
scored for viability in the plus and minus Basta drench and for the presence or absence of
viral symptoms at the end of two and a half weeks.

Plants which showed resistance to the virus, as measured by the absence of viral
symptoms, and a susceptible sibling were assayed by Northern blot analysis using the NIa
fragment as described above. The resistant plants were also assayed by ELISA and
Western blot analysis for the presence of MDMV-B coat protein in the plants.

ELISA and Western Blot Analysis of the Transgenic Plants

The primary antibody used for both assays was a polyclonal antibody specific for the
MDMV-B coat protein which was obtained from Dr. S. Jensen (University of Nebraska-Lincoln).
The second antibody was an affinity purified polyclonal IgG alkaline phosphatase
labeled goat anti-rabbit antibody (Kirkegaard and Perry Laboratories, Gaithersburg,
Maryland).

ELISA Analysis

Tissue samples were taken from all plants not exhibiting characteristic MDMV-B
symptoms and from one infected plant. Samples were also taken from healthy and infected
CG00526 plants as controls. The samples (two leaf punches, 1 cm in diameter) were taken
from both the inoculated leaf and the youngest available leaf. The tissue samples were
homogenized in 0.400 ml of borate buffered saline (100mM boric acid, 25mM sodium
borate, 75mM sodium chloride). Aliquots (50µl) of each sample were applied to an ethanol
washed ELISA plate and incubated overnight at [temperature illegible]. The plates were then
washed once with ELISA wash buffer (10mM Tris-HCl, 0.05% Tween-20, 0.02% sodium azide), and
blocked with ELISA block/diluent (10mM sodium phosphate, 140mM sodium chloride,
0.05% Tween-20, 1% BSA, 0.02% sodium azide) for one hour at room temperature. The
plates were washed three times with ELISA wash buffer. The primary antibody was applied
at a 1:5000 dilution in 50µl of ELISA block/diluent and incubated for 2 hours at 37°C and
then washed three times with ELISA wash buffer. The second antibody was applied at a
concentration of 1.5mg/ml in ELISA block/diluent and incubated for 2 hours at 37°C. The
plates were washed three times with ELISA wash buffer and were developed by incubation
in ELISA substrate (Kirkegaard and Perry) for 30 minutes at room temperature. The reaction
was stopped by the addition of 50µl of 3M sodium hydroxide. The plates were read with a
SLT 340 ATTC ELISA plate reader (SLT Labinstruments) at 405nm.

Western Blot Analysis

Western blot analysis was performed on samples used for ELISA analysis. A 2µl
aliquot of the samples was diluted into 10µl of 1X loading dye (Novex Inc). The samples
were electrophoresed on an 8%-16% Tris-glycine polyacrylamide gel (Novex) in Tris running
buffer (25mM Tris-Base, 192mM glycine, 0.1% SDS) at 120 volts for approximately 2.5
hours. The gel was blotted onto nitrocellulose using a Biorad blotting apparatus in transfer
buffer (25mM Tris-Base, 192mM glycine, and 10% methanol) at 120 volts for 45 minutes.
The filter was blocked with blocking/diluent (1X TBS (20mM Tris-Base, 500mM NaCl, pH
7.5), 0.05% Tween-20, 1% BSA, 5% lamb serum) at room temperature for 45 minutes. The
filter was incubated with the primary antibody, described above, at a dilution of 1:1000 in
blocking/diluent at room temperature for 1.25 hours. The filter was washed for five minutes
in 1X TTBS (1X TBS, 0.05% Tween-20). The second antibody, described above, was
incubated with the filter in blocking/diluent at a dilution of 1:1000 for 1.25 hours at room
temperature. The filters were washed twice for 5 minutes in 1X TTBS followed by a single
wash in 1X TBS for 5 minutes. The filter was developed with Nitro Blue Tetrazolium (NBT)
and 5-bromo-4-chloro-3-indolyl phosphate (BCIP) in 0.1M Tris-HCl pH 9.5 as described
by the manufacturer. The filter was developed for approximately 20 minutes and the reaction
was then stopped by washing the filter with water.
Characterization of the MDMV-B Genome

Clones have been isolated and sequenced representing 8530 nucleotides of the
MDMV-B genome. We have identified a single large open reading frame as would be
expected of a virus belonging to the potyvirus family. We have identified regions of the
polyprotein which would encode the coat protein (nucleotides 7308-8291 of SEQ ID No. 1
and amino acids 2436-2763 of SEQ ID No. 2), the putative RNA dependent RNA
polymerase (RdRp) termed NIb (nucleotides 5745-7307 of SEQ ID No. 1 and amino acids
1915-2435 of SEQ ID No. 2), the NIa proteinase (nucleotides 4452-5744 of SEQ ID No. 1
and amino acids 1484-1914 of SEQ ID No. 2), the 6K protein (nucleotides 4293-4451 of
SEQ ID No. 1 and amino acids 1431-1483 of SEQ ID No. 2), cylindrical inclusion protein
(CIP) containing the helicase (nucleotides 2376-4292 of SEQ ID No. 1 and amino acids
792-1430 of SEQ ID No. 2), P3 proteinase (nucleotides 1134-2375 of SEQ ID No. 1 and amino
acids 378-791 of SEQ ID No. 2), and a portion of the helper component-P2 proteinase
(HC-Pro) (nucleotides 3-1133 of SEQ ID No. 1 and amino acids 1-377 of SEQ ID No. 2).
Identification was based on the location of putative cleavage sites and conserved motifs.
The MDMV-B sequence of the CP region from our isolate was 99% identical to the
previously sequenced MDMV-B CP and 78% identical to the MDMV-A CP. Further
comparisons could not be made due to the lack of additional sequence for other MDMV
strains. The sequence of MDMV-B was then compared to other potyviruses and was found
to exhibit approximately 60% nucleotide sequence identity to other potyviruses. The level
of identity varied little when sequences encoding the different proteins were used for the
comparison.
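
The nucleotide and amino acid coordinates quoted for each region are related by the reading frame, which begins at nucleotide 3 of SEQ ID No. 1. The sketch below shows that arithmetic; the function name `nt_to_aa_range` and the region labels are hypothetical names introduced here, not part of the original disclosure.

```python
CDS_START = 3  # first nucleotide of the open reading frame in SEQ ID No. 1

# Nucleotide ranges (1-based, inclusive) as quoted in the text;
# the keys are hypothetical short labels.
REGIONS = {
    "HC-Pro (partial)": (3, 1133),
    "P3": (1134, 2375),
    "CIP": (2376, 4292),
    "6K": (4293, 4451),
    "NIa": (4452, 5744),
    "NIb": (5745, 7307),
    "CP": (7308, 8291),
}


def nt_to_aa_range(nt_start, nt_end, cds_start=CDS_START):
    """Convert an in-frame nucleotide range to polyprotein residue numbers."""
    if (nt_start - cds_start) % 3 != 0 or (nt_end - nt_start + 1) % 3 != 0:
        raise ValueError("region is not aligned to the reading frame")
    first = (nt_start - cds_start) // 3 + 1
    last = (nt_end - cds_start + 1) // 3
    return first, last


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(name, nt_to_aa_range(start, end))
```

Applied to the ranges above, the conversion reproduces the amino acid ranges stated in the text, for example residues 1484-1914 for the NIa proteinase.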

T0 Analysis

Eighteen lines (individual transformation events from selection and regeneration) were
obtained from the experiments in this example. 17 of the 18 lines were positive by PCR for
the selectable marker, and 14 for the gene of interest. All 14 events which were PCR
positive for the NIa gene were also positive for expression in the Northern analysis. The
predominant mRNA species was approximately 1300 nucleotides in length which would
correspond to the predicted size of the transgene. A smaller species approximately 1000
nucleotides in length was also detected which most likely arose by processing. Differences
in mRNA expression levels were seen between different events as well as between
individual plants (siblings) from a given event. All PCR positive plants were used for seed
production (T1).
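
The counts reported for the T0 lines can be restated as proportions. This small sketch only repeats the arithmetic on the numbers given in the text; the variable and function names are hypothetical.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts taken from the text.
TOTAL_LINES = 18
MARKER_POSITIVE = 17    # PCR positive for the selectable marker
NIA_PCR_POSITIVE = 14   # PCR positive for the gene of interest
NIA_NORTHERN_POSITIVE = 14

if __name__ == "__main__":
    print("marker positive:", percent(MARKER_POSITIVE, TOTAL_LINES), "%")
    print("NIa PCR positive:", percent(NIA_PCR_POSITIVE, TOTAL_LINES), "%")
    print("NIa expressed among PCR positives:",
          percent(NIA_NORTHERN_POSITIVE, NIA_PCR_POSITIVE), "%")
```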

T1 Analysis

Four plants from two different events were identified to be resistant to the virus inoculation
as evidenced by the absence of visual symptoms. There was no correlation to Basta
tolerance in this example. Northern analysis of the four plants showed no detectable NIa
transcript in the four resistant plants, while an infected sibling plant from the same original
ear (T0) was shown to have high levels of viral RNA. The levels of MDMV-B in the infected
sibling were similar to the levels seen in the control CG00526 plants.

The resistant plants were also evaluated for the presence of viral coat protein by
ELISA.

The four values obtained for each sample, duplicate samples from the inoculated leaf and
non-inoculated leaf, were averaged and a comparison made against the infected and
healthy controls. No detectable virus was present in the resistant transformed plant lines by
ELISA, for which the threshold of detection was approximately 2 ng of virus per sample. In
contrast, the transformed siblings which exhibited symptoms contained levels of virus
similar to that seen in the infected CG00526 control plants. These results provide conclusive
evidence that the four plants were immune to MDMV-B infection (i.e. not supporting virus
replication). The resistance was durable in that the resistant plants withstood two
inoculations with high MDMV-B inoculum concentrations. The inoculum concentrations
used in these experiments typically result in symptoms within four days in susceptible plant
lines. Yet, the resistant plants have not produced visible symptoms nor detectable virus six
weeks following inoculation.
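
The averaging step described for the ELISA readings can be sketched as follows. The text states only that the four values per sample were averaged and compared with the controls, so the decision rule here (a fold cutoff over the healthy control) and every absorbance value are hypothetical placeholders for illustration, not data or criteria from the experiments.

```python
from statistics import mean


def sample_mean(inoculated, non_inoculated):
    """Average the duplicate readings from the inoculated and non-inoculated leaf."""
    values = list(inoculated) + list(non_inoculated)
    if len(values) != 4:
        raise ValueError("expected two duplicate readings from each leaf")
    return mean(values)


def virus_detected(sample, healthy_control, fold_cutoff=2.0):
    """Hypothetical rule: call a sample positive above fold_cutoff x healthy control."""
    return sample > fold_cutoff * healthy_control


if __name__ == "__main__":
    # All absorbance values below are invented for illustration only.
    healthy = sample_mean([0.10, 0.11], [0.09, 0.10])
    candidate = sample_mean([0.10, 0.12], [0.11, 0.09])
    print("mean A405:", candidate, "detected:", virus_detected(candidate, healthy))
```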



Various modifications of the invention described herein will become apparent to those
skilled in the art. Such modifications are intended to fall within the scope of the appended
claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: CIBA-GEIGY AG
(B) STREET: Klybeckstr. 141
(C) CITY: Basel
(E) COUNTRY: Switzerland
(F) POSTAL CODE (ZIP): 4002
(G) TELEPHONE: +41 61 69 11 11
(H) TELEFAX: +41 61 696 79 76
(I) TELEX: 962 991

(ii) TITLE OF INVENTION: USE OF TRANSLATIONALLY ALTERED RNA TO CONFER
RESISTANCE TO MAIZE DWARF MOSAIC VIRUS AND OTHER
MONOCOTYLEDONOUS PLANT VIRUSES

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8543 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: RNA (genomic)

(iii) HYPOTHETICAL: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..8291
(D) OTHER INFORMATION: /product= "polyprotein encoded by MDMV-B genome"

(ix) FEATURE:
(A) NAME/KEY: 3'UTR
(B) LOCATION: 8292..8530

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 3..1133
(D) OTHER INFORMATION: /product= "3-prime sequence for HC-Pro"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 1134..2375
(D) OTHER INFORMATION: /product= "P3 proteinase"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 2376..4292
(D) OTHER INFORMATION: /product= "cylindrical inclusion protein"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 4293..4451
(D) OTHER INFORMATION: /product= "K2 (6kD protein)"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 4452..5744
(D) OTHER INFORMATION: /product= "NIa proteinase"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 5745..7307
(D) OTHER INFORMATION: /product= "NIb replicase"

(ix) FEATURE:
(A) NAME/KEY: misc_RNA
(B) LOCATION: 7308..8291
(D) OTHER INFORMATION: /product= "coat protein"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

UC GAA GAG AAA CAA CGA GAG UAU CUU GCA AAG GAU CAA AAA CUA UCC 47
Glu Glu Lys Gln Arg Glu Tyr Leu Ala Lys Asp Gln Lys Leu Ser
1 5 10 15



AGA AUG AUA CAA UUU AUC AAA GAA AGG UGC AAU CCA AAA UUU UCG CAU 95
Arg Met Ile Gln Phe Ile Lys Glu Arg Cys Asn Pro Lys Phe Ser His
20 25 30

UUA CCA ACG COA UGG CAA GUC GCG GAA ACA AUA GC5G CAC UHJ AOJ GfiD 143
Leu Pro "nir Leu Trp Glji Vai Ala Glu Tta He Gly His T/r Thr Asp 

35 40 45 

AAC CAG UCA AAG CAA AUA AIX3 GMJ GCJCJ AGC GAA GCG CJC ADC AAA GOa ISl 
Asn Gin Ser Lys Gin lie Mec Asp Vai Ser Glu Ala Leu He Lys Val 
50 55 SO 

AAU ACJ CJG ACJ CO: GAtJ GAL^ GCJ ADG AAA GCA AGC GCA GCG UUA CTU 239 
Asn Ihr Leu liir Pro Asp Asp Ala Met Lys Ala Ser Ala Ala Leu Leu 
65 70 75 

GAA GOG UCG CGA U3G UftU AAG AAU OGU AAG GAG UCA CX AAA ACJ GAC 287 
Glu Val Ser Arg Trp Tyr Lys Asn Arg Lys Glu Ser Leu Lys Thr Asp 
80 35 90 95 

UCA UOG GAA UCJ XJOU AGA AAU AAA AUA UCA CCA AAG AGU ACA AUA AAU 335 
Ser Leu Glu Ser Phe Arg Asn Lys He Ser Pro Lys Ser Thr He Asn 

100 105 UO 

GCA GCU UUA AUG UGC GAU AAU GAA UTO GAU AAA AAU GCA AAU UUU GOA 383 
Ala Ala Leu Men Cys Asp Asn Gin Leu Asp Lys Asn Ala Asn Phe Val 

115 120 125 

UGG G3U AAU AGG GAA UAC CAC GCC AAA CGA LTTJ UUC GCA AAC UAU UUU 431 
Itp Gly Asn Arg Glu "Tyr His Ala Lys Arg Phe Phe Ala Asn Tyr Phe 
130 135 140 

NAA GCA GUG GAU CCC ACA GAU GCA UAU GAA AAG CAC GUC ACA 035 UOC 479 
Xaa Ala Val Asp Pro Ibr Asp Ala Tyr Glu Lys His Val Ita" Arg She 
145 ISO 155 

AAC ecu AAU GGU CAA CGA AAG UUA UCA AUA GGA AAG UUA GUU AUC CCA 527 
Asn Pro Asn Gly Gin Arg Lys Leu Ser He Gly Lys Leu Val He Pro 
160 165 170 175 

COA GAC UUU CAA AAG AUU AGA GAA UCA UUC GUU GGA CUC UCG AUA AAU 575 
Leu Asp Phe Gin Lys He Arg Glu Ser Phe Val Gly Leu Ser He Asn 

180 185 190 

AGA CAA CCG CUG GAU AAA UGU UGU GUU AGC AAG AUC -GAA GGA GGG UAU 623 
Arg Gin Pro Leu Aso Lys Cys Cys Val Ser Lys He Glu Gly Gly Tyr 

195 ' 200 205 

AUA UAC CCA UGU UGC UGC GUC ACA ACA GAA UUU GGU AAA CCA GCA UAC 671 
He Tyr Pro Cys Cys C/s Vai Thr Thr Glu Phe Gly Lys Pro Ala Tyr 
210 215 220 



JCU GAG AUA AUA CCU -CCA ACG AAA GGG CAU AUA ACA AUA GGC AAU UCU 719 
Ser Glu He He Pro Pro Vnr Lys Gly His He Ita He Gly Asn Ser 
225 230 235 

alt; GAU UCA AAG AUU GUG GAC UUG CCA AAU ACA ACA CCA CCC AGC ADS 767 
He Asp Ser Lys He Val Asp Leu Pro Asn Thr Thr Pro Pro Ser Met 
240 245 250 255

UAC AUU GCU AAG GMJ GGG UAD UGC UAC AUC AAC AUC UUa UUA GCA OCC 815
Tyr lie Ala Lys Asd Gly Tyr Cys Tyr lie Asn lie Phe Leu Ala Ala 

260 265 270 

ADS AUC AAC GCJU AAU GAA GAA UOJ GCC AAG GftU UAU ACG AAA UUU UOG 863 
Met lie Asn Val Asn Glu Glu Ser Ala Lys Asp Tyr Thr Lys Phe Leu 

275 280 285 

AGG GAC GAA CUA OJJ GAG 030 COC GGA AAG UQG CCA AAG OJU AAA GAC 911 
Arg Asp Glu Leu Val Glu Arg Leu Gly Lys Trp ?ro Lys Lai Lys Asp 
290 295 300 

GOA GCA ACA GCG OGU UAU GCA UUA UC3 GOA AU3 WU CCA GAA AUU AAG 959 
Val Ala Thr Ala Cys Tyr Ala Leu Ser Val Met Phe Pro Glu lie Lys 
305 310 315 

AAU GCU GAG OJA CCU CCA AL\J CUA GUU GAC CAU GAA AAU AAA UCA AUG 1007 
Asn Ala Glu Leu Pro Pro lie Leu Val Asp His Glu Asn Lys Ser Met 
320 325 330 335 

CAC GUA AUC GAU UCA UAU G5U UCA CUA AGC GUU GGA UUU CAC AUA UUA 1055 
His Val lie Asp Ser Tyr Gly Ser Leu Ser Val Gly Phe His He Leu 

340 345 350 

AAA GCA AGC ACG AUU G3U CAA UUA AUC AAA UUU CAA UAU GAG UCU AUG 1103 
Lys Ala Ser Hir He Gly Gin Leu He Lys Phe Gin Tyr Glu Ser Met 

355 360 365 

GAU AGU GAA AUG CGC GAA UAC AUA GUA GGA GGA ACU CUC ACA CAA CAG 1151 
Asp Ser Glu Met Arg Glu Tyr lie Val Gly Gly Thr Leu Thr Gin Gin 
370 375 380 

ACA UUC AAC ACA CUU CUU AAG AUG CUU ACG AAA AAC AUG UUC AAA CCA 1199 
Thr Phe Asn T^ Leu Leu Lys Met Leu 1^ Lys Asn Met Phe Lys Pro 
385 390 395 

GAS CGC AUC AAG CAG AUA ALTJ GAA GAG GAA CCU LTJC UUA CUU AUG AUG 1247 
Glu Arg He Lys Gin He He Glu Glu Glu Pro Phe Leu Leu Met Met 
400 405 410 415 

GCG AUU GCG UCU CCA ACG GUA UUA AUA GCA CJA UAU AAU AAU UGU UAU 1295 
Ala He Ala Ser Pro Thr Val Leu He Ala Leu Tyr Asn Asn Cys Tyr 

420 425 430 

AUU GAG CAA GCU AUG ACA UAC UGG AUC GUU AAG AAU CAA GGA GUU GCA 1343 
He Glu Gin Ala Met Thr Tyr Trp He Val Lys Asn Gin Gly Val Ala 

435 440 445 



GCC AUA UUC GCA CAA CJC GAA GCA UUA GCC AAG AAA ACA UCC CAG GCU 1391 
Ala He Phe Ala Gin Leu Glu Ala Leu Ala Lys Lys Thr Ser Gin Ala 
450 455 460 



GAG CUA UUA GUU CUA CAA AUG CAG AUA CUU GAA AAA GCA UCU AAC CAA 1439
Glu Leu Leu Val Leu Gln Met Gln Ile Leu Glu Lys Ala Ser Asn Gln
465 470 475

im AGA UUA GCX GDU UCA GGA CTJU J^GC GAU AUG GAC CCA GCA AAG C33A 1487 
Leu Arg Lesu Ala Val Ser Gly Lea Ser His lie Asp Pro Ala Lys Arg 
480 485 490 495 

CJa UUG OGS UCA CAC CCKI GAA GOG AUG UCA ACA CGA -JCA GAA AU3 AAC 1535 
Leu Leu Trp Ser His Leu Glu Ala Met Ser 1^ Arg Ser Glu Met Asn 

500 505 510 

AAG GAG ;JUA AUA GCJ GAG QGG UAU GCA COA 'JAD GAC GAG CGC COA UAC 1583 
Lys Glu Leu lie Ala Glu Gly Tyr Ala Leu lyr Asp Glu Arg Leu Tyr 

515 520 525 

ACC CUG ADG GAA AAA AGU UAC GOA GAU CAA UOA AAC CAA UCA OSG GCA 1631 
IJir Leu Men Glu Lys Ser Tyr Val Asp Gin Leu Asn Gin Ser Trp Ala 
530 535 540 

GAA UUG UCA UAC U3J GGA AAA UOT UCA GCA AUA UGG CGU GUG WC AGA 1679 
Glu Leu Ser Tyr Cys Gly Lys Pbe Ser Ala lie Trp Arg Val Phe Arg 
545 550 555 

GCC AGG AAG UAU UAC AAA CCG LICIT UUA ACC GUG AGA AAA AGC GUA GAU 1727 
Val Arg Lys Tyr Tyr Lys Pro Ser hmi Tbr Veil Arg Lys Ser Val Asp 
560 565 570 575 

UUA GGC GCU GUA UAC AAU AUA UCA GCU AD3 CAU CUA AUA UCA GAU UUA 1775 
Leu Gly Ala Val Tyr Asn He Ser Ala Ttir His Leu He Ser Asp Leu 

580 585 590 

GCG CGG AAA AGU CAA GAU CAA GUC AGC UCU AUU UUA ACC AAA CUC CGC 1823 
Ala Arg Lys Ser Gin A^ Gin Val Ser Ser He Leu Thr Lys Leu Arg 

595 600 605 

AAC GGU UUU UAU GAU AAA UUA GAG AAA GUU AGA AUA CGA ACU AUA AAA 1871 
Asn Gly Phe Tyr Asp Lys Leu Glu Lys Val Arg He Arg Thr He Lys 
610 615 620 

ACS GUU UAU U3G UUU AUA CCU GAU AUA UUU AGA CUC GUG CAC AUA UUC 1919 
Thr Val Tyr Trp Phe He Pro Asp He Phe Arg Leu Val His He Phe 
625 630 635 

AUA GUU UUG AGU UUA UUA ACU ACC AUC GCU AAC ACU AUC AUA GUA ACJ 1967 
He Val Leu Ser Leu Leu Thr Thr He Ala Asn Thr He He Val Thr 
640 645 650 655 

AUG AAU GAC UAC AAG AAA UUG AAG AAG CAA CAA AGA GAA GAC GAA UAU 2015 
Met Asn .^sp Tyr Lys Lys Leu Lys Lys Gin Gin Arg Giu Asp Glu Tyr 

660 665 670 

GAA GCA GAA AUU AGC GAA GUU CGC AGA AUC CAU VCJ ACC UUA AUG GAA 2063 
Glu Ala Glu He Ser Glu Val Arg Arg He His 'er Thr Leu Mec Glu 

675 oBO - 585 

GAG CGG AAG GAC AAU CUG ACG UGU GAA CAA UUU AUU GAG UAU AUG CGU 2111
Glu Arg Lys Asp Asn Leu Thr Cys Glu Gln Phe Ile Glu Tyr Met Arg
690 695 700 

CMA AAU GAU CCA COG COA GOT GGA GNA ACA CUG GAC UUG AOJ CAC ACA 2159 
Xaa Asn His Pro Arg Leu Val Gly Xaa *Ihr Leu Asp Leu llir His ISir 
705 710 715 

GGa GOC AUA CaU GAA QGG AAA UCC AAU CUC GAA ACC AAU DUG G3\A CAG 2207 
Gly Val He His Glu Gly Lys Ser Asn Leu Glu Tbr Asn Leu Glu Gin 
720 725 730 735 

UCA AUG GCA GUU GGA ACC UOG AUA ACA AUG AUA CUU GAU CCA CAG AAA 2255 
Ser Met Ala Val Gly Bar Leu He Ihr Met lie Leu Asp Pro Gin Lys 

740 745 750 

AGC GAU GOT GCC UAU AAG GOG OUG AAC AAA AUG CGG ACA GUA AUU AGU 2303 
Ser Asp Ala Val TVr Lys Val Leu Asn Lys Met Arg Thr Val He Ser 

755 760 765 

ACA AUU GAA CAA AAC GUC CCA UUC CCU UCA GUG AAU UUC UCC AAC AUC 2351 
Thr He Glu Gin Asn Val Pro Phe Pro Ser Val Asn ?he Ser Asn He 
770 775 780 

UUA ACA ecu CCA GOG GCA CAA CAG AGU GUA GAU GUU GAU GAG CCA UUA 2399 
Leu "Thr Pro Pro Val Ala Gin Gin Ser Val Asp Val Asp Glu Pro Leu 
785 790 795 

ACA CUU AGC ACU GAU AAA AAU UUA ACA AUA GAC UUU GAC ACA AAU CAA 2447 
•Hir Leu Ser Ifer Asp Lys Asn Leu Ihr He Asp Phe Asp Tte Asn Gin 
800 305 810 815 

GAU OIA ecu GCC GAU ACA UOC AGU AAU GAU GUG ACA UUU GRA GAU UGG 2495 
Asp Leu Pro Ala Asp Ihr Phe Ser Asn Asp Val Hir Phe Xaa Asp Trp 

820 825 830 

UGG UCA \mJ CAA uTIA AGC AAC AAC AGA ACA GOG SCA CAC UAC CGA CNU 2543 
Trp Ser Xaa Gin Leu Ser Asn Asn Arg Thr Val Xaa His Tyr Arg Xaa 

835 340 345 

UGG GGG GAA AGU YCA UUG GAA UUC ACA CGA GAA AAC GCA GCC GAC ACG 2591 
Trp Gly Glu Ser Xaa Leu Glu Phe Thr Arg Glu Asn Ala Ala His Thr 
850 855 860 

AGC AUC GAA CUU GCA CAC UCA AAC AUU GAG AGG GAA UUC UUG CUU AGA 2639 
Ser He Glu Leu Ala His Ser Asn He Glu Arg Glu Phe Leu Leu Arg 
865 870 875 

GGA GCA GUC GGC L^G GGA AAA UCC ACU GGG UUA CCA UAC CAU CUU AGC 2687 
Gly Ala Vai Gly Ser Gly Lys Ser Thr Gly Leu Pro Tyr His Leu Ser 
880 885 890 895 

AUG CGC GGA AAA GUG OJG CUA CUA GAG CCU ACA AGA CCG CJA GCU GAG 2735 
Met Arg Gly Lys Val Leu Leu Leu Glu Pro Thr Arg Pro Leu Ala Glu 

900 905 910

AAC GUG OCT AGG CAA CUA CAA ':XSA CC3 CCA UUU AAC GUA AGU CCA ACU 2783
Asn Vai Cys Arg Gin Leu Gin Gly Pro Pro Phe Asn Val Ser Pro Tbr 

915 920 925 

CUU CAA AUG CGU GGA UOA AGU UC: UUU GGA U3C ACU CCA AUC ACA ADC 2831 
hsM Gin Met Arg Gly Leu Ser Ser Phe Gly Cys Ttir Pro lie Ihr He 
930 935 940 

AUG ACA IXU GS3 UUC GCA UUG CAC AUG UAC GCA AAU AAU CCA GAU AAA 2879 
Met Itor Ser Gly Phe Ala Leu His Met Tyr Ala Asn Asn Pro Asp Lys 
945 950 955 

AUA UCa GAG UAC GMJ UUC AUA AUC UUU GAU GAA UGU CAU AUA AUG GAA 2927 
He Ser Glu Tyr Asp Phe lie He Phe Asp Glu Cys His He Met Glu 
960 965 970 975 

GCA CCA GCG AUG GCC UOU UAU O^^J UUA CJC AAA GAA UAU GAA UAU CGA 2975 
Ala Pro Ala Met Ala Ste Tyr Cys Leu Leu Lys Glu Tyr Glu Tyr Arg 

980 985 990 

GGA AAA AUU AUC AAG Gv3A UCA GCU ACG CCU CCA GGA AGG GAG U3U GAA 3023 
Gly Lys lie He Lys Vai Ser Ala Thr Pro Pro Gly Arg Glu Cys Glu 

995 1000 1005 

UUC ACA ACA GAA CAU CCA GUA GAC AUC CAU GUU UGU GAG .^AU CUA ACU 3071 
Phe Thr Thr Gin His Pro Val Asp He His Val Cys Glu Asn Leu Thr 
1010 1015 1020 

GAG CAA CAG UUU GUU AUG GAA CUC GGG ACJ GGU UCA ACC GCA GAU GCU 3119 
Gin Gin Gin Phe Val Met Glu Leu Gly Ihr Gly Ser Thr Ala Asp Ala 
1025 1030 1035 

ACG AAG UAC GGA AAU AAU ADC UUA GUU UAU GUA GCA AGC UAU AAU GAC 3167 
Thr Lys Tyr Gly Asn Asa He L®i Val Tyr Val Ala Ser Tyr Asn Asp 
1040 1045 1050 1055 

GUC GAU UCA UUG UCG CAA GCA OlA GUC GAA CUU AAA UUU UCC GUA AUC ' 3215 
Val Asp Ser Leu Ser Gin Ala Leu Val Glu Leu Lys Phe Ser Val He 

1060 1065 1070 

AAA GUG GAU GGC CGA ACA AUG AAA CAA AAC ACA ACA GGA AUC AUU ACA 3263 
Lys Val Aso Gly Arg Thr Mec Lys Gin Asn Thr Thr Gly He He Thr 

1075 1080 1085 

AAC GGU ACC GCA CAA AAG AAG UGU ULU GUU GUC GCA ACG AAU AUA AUU 3311 
Asn Gly Thr Ala Gin Lys Lys Cys Phe Val Vai Ala Thr Asn He He 
1090 1095 1100 

GAG AAU GGC GUC ACA CUA GAU AL^U GAU GUU GGU GUC GAC UUC GGA CUU 3359 
Glu Asn Gly Val Thr Leu Asp He Asp Val Gly Val Asp Phe Gly Leu 
1105 1110 1115 

AAA GUr UCA GCU GAC UUG GAC GUU GAC AAC AGG GCG GUA UUG UAU AAA 3407 
Lys Val Ser Ala Asp Z^u Asp Val Asp Asn Arg Ala Vai Leu Tyr Lys 
1120 1125 1130 1135

CGC GUA AGU AUA UCA UMJ GGU GAA CUC AUA CAA CGA UCG GGU COT GCJU 3455
Arg Val Ser lie Ser Tyr Gly Glu Leu lie Gla Arg Leu Gly Arg Val 

1140 1145 1150 

GQC AGA AAU AAA COJ GOT ACA GOT AOT OGA AUC GGA AAA ACA AUG AAA 3503 
Gly Arg Asn Lys Pro Gly "aBr Val lie Arg He Gly Lys Tte Mer Lys 

1155 1160 U65 

GOT UUG CAG GAA AUO CCA GCA AUG AUC GCA ACA GAG GCA GCC UUC AD3 3551 
Gly Leu Gin Glu He Pro Ala Met He Ala liir Glu Ala Ala She Met 
1170 1175 1180 

UOT OTC GCU UAC GOT COT AAA GOT AUC AOJ CAU AAU GOT UCA AOG ACC 3599 
Cys Phe Ala Tyr Gly Leu Lys Val lie Thr His Asn Val Ser 153: Ifer 
1185 1190 1195 

CAU COT GCA AAG UGC ACA GOT AAA CAA GCG AGA ACC AUG AUG CAA UOT 3647 
His Leu Ala Lys Cys Tlir Val Lys Gin Ala Arg Ite Met Met Gin Phe 
1200 1205 1210 1215 

GAA UUA UCA CCA UOT GUC AUG GOT GAG CUC GOT AAG UOT GAU GOT UCA 3695 
Glu Leu Ser Pro Phe Vad Met Ala Glu Leu Val Lys Phe Asp Gly Ser 

1220 1225 1230 

AUG CAU CCA CAA AUA CAU GAG GCA CUA GUA AAA UAC AAA COT AGA GAU 3743 
Met His Pro Gin He His Glu Ala Leu Val Lys Tyr Lys Leu Arg Asp 

1235 1240 1245 

UOT GUC AUA AUG CUC AGA CCG AAU GCA COT CCA AGG GUC AAU UUA CAU 3791 
Ser Val He Met Leu Arg Pro Asn Ala Leu Pro Arg Val Asn Leu His 
1250 1255 1260 

AAU UGG COT ACA GCC CGA CffiU UAU AAU AGA AUA GGA UOT UCA UUA GAA 3839 
Asn Trp Leu Thr Ala Arg Asp Tyr Asn Arg He Gly Cys Ser Leu Glu 
1265 1270 1275 

CUC GAA GAC CAC GUC AAA AOT CCG UAC UAC AOT AGG GGA GOT COT GAC 3887 
Leu Glu Asp His Val Lys He Pro Tyr Tyr He Arg Gly Val Pro Asp 
1280 ' 1285 1290 1295 

AAG UUG uAU GGA AAG CUA UAU GAU AUU AUC UUA CAG GAU AOT CCA AOT 3935 
Lys Leu Tyr Gly Lys Leu Tyr Asp He He Leu Gin Asp Ser Pro Thr 

1300 1305 1310 

AOT UGC UAC AOT AGA OTA UCA AOT GCG UOT GCA GOT AAA GUA GCA UAU 3983 
Ser Cys Tyr Ser Arg Leu Ser Ser Ala Cys Aia Gly Lys Val Ala Tyr 

1215 1320 1325 

AOT OTG CGA AOT GAU CCA UOT UCA OTU CCA AGA AC^ AUA GCA AUA AOT 4031 
Thr Leu Arg Thr Asp Pro Phe Ser Leu Pro Arg Thr He Ala He He 
1330 1335 1340 

AAU GCC UVA AUC ACG GAG GAG UAU GCG AAG AGA GAU CAC UAU COT AAC 4079 
Asn Ala Xaa Ile Thr Glu Glu Tyr Ala Lys Arg Asp His Tyr Arg Asn
1345 1350 1355

AUG mJ YCA AAC CCA DCD UCA UCA CAC OCA UlT UCA OT AAU C3GG U03 4127 
Met He ICaa Asn Pro Ser Ser Ser His Ala Phe Ser Leu Asn Gly Leu 
1360 1365 1370 1375 

GOG UOJ AUG ADC GCCJ AC:J AGA X2R0 AUG AAA GAC CAC ACA AAG GAG AAU 4175 
Val Ser Met: He Ala Utc Arg Tyr Mac Lys Asp His Thr Lys Glu Asn 

1380 1385 1390 

AUU GAC AAA CUC AUC AGA GOG CGO GAU CAA UUA CUU GAG UDO CAA GGU 4223 
He Asp Lys Leu He Arg Val Arg Asp Gin Leu Leu Glu Hie Gin Gly 

1395 1400 1405 

AOJ GGA ATO CAA UOD CAA GAU CCA UCA GAA OJC AUG GAA ADU GOG GCa 4271 
Hit Gly Met Gin PSie Gin Asp 3ro Ser Glu Leu Met Glu He Gly Ala 
1410 1415 1420 

cue AAC ACA GUU AUU CAC CAA GGA AUG GAC GCA AUU GCA GCU UGU ADU 4319 
Leu Asn Ibr Val He His Gin Gly Met Asp Ala He Ala Ala Cys He 
1425 1430 1435 

GAG UUA CAA GGA CGA UGG AAU GCD UCA CUU AUA CAA CGC GAU CUC COA 4367 
Glu Leu Gin Gly Arg Tcp Asn Ala Ser Leu He Gin Arg Asp Leu Leu 
1440 1445 1450 1455 

AUU GCA GGU GGA GUU UUU ADC GGA GGC AUU UUG AUG AUG UGG AGC CUA 4415 
He Ala Gly Gly Val Phe He Gly Gly He Leu Met Met Trp Ser Leu 

1460 1465 1470 

UUU ACU AAA UGG AGU AAC ACA AAU GUC UCA CAU C?iG GGG AAG AAC AAA 4463 
Phe Thr Lys Ttp Ser Asn Tbr Asn Vaii Ser His Gin Gly Lys Asn Lys 

1475 1480 1485 

CSC AGU AGA CAA AAA CUU OGA TJCC AAA GAA GCA AGA GAC AAC AAA UAU 4511 

Arg Ser Arg Gin Lys Leu Arg Phe Lys Glu Ala Arg Asp Asn Lys Tyr 
1490 1495 1500 

GCA UAU GAU GUC ACA GGA UCG GAA GAA UGC CUU GGC GAG AAU UUU GGA 4559 
Ala Tyr Asp Val Ite Gly Ser Glu Glu Cys Leu Gly Glu Asn Phe Gly 
1505 1510 1515 

ACA GCC UAU ACA AAG AAA GGU AAA GGA AAA GGA ACJ AAA GUU GGA CUC 4607 
Ihr Ala Tyr Thr Lys Lys Gly Lys Gly Lys Gly Thr Lyis Val Gly Leu 
1520 1525 1530 1535 

GGU GUG AAG CAG CAU AAA UL^ CAU AUG AUG UAC GGU UUC GAU CCC CAA 4655 
Gly Val Lys Gin His Lys Phe His Met Met Tyr Gly Phe Asp Pro Gin 

1540 1545 1550 

GAG UAC AAC CJA AUU CGG UUU GUC GAU CCA CJC ACG GGA GCA ACJ OJU 4703 
Glu Tyr Asn Leu He Arg Phe Val Asp Pro Leu Thr Gly Ala Thr Leu 

1555 1560 1565 

GAU GAA CAA AUC CAU GCC GAU AUA CGC UUA AUU CAA GAG CAC UUC GCU 4751
Asp Glu Gln Ile His Ala Asp Ile Arg Leu Ile Gln Glu His Phe Ala
1570 1575 1580 

GAA AUU CGU GAG GAG GCA GUG ADU AAU GAC ACA ADD GAA AGG GAG CAG 4799 
Glu He Arg Glu Glu Ala Val lie Asn Asp Hrr lie Glu Arg Gin Gin 
1585 1590 1595 

ADU UaC OGC AAU CCU GGA CUk CAA GCA UUU UUC ACIA GAA AAU GGG UCA 4847 
lie TVr Gly Asn Pro Gly Leu Gin Ala Phe She He Gin Asn Gly Ser 
1600 1605 1610 1615 

GCA AAC go; cog AGA god GAU UUA ACA CCA CAU UCA CCU ACA CGA GOU 4895 
Ala Asn Ala Leu Arg Val Asp Leu Tbr Pro His Ser Pro Ihr Arg Val 

1620 1625 1630 

GUC ACA GGU AAU AAC AUA OCA GOG UUC CCA GAA UAU GAA GGA ACA CUU 4943 
Val rbr Gly Asn Asn He Ala Gly Phe Pro Glu Tyr Glu Gly Itor Leu 

1635 1640 1645 

CGU CAG ACU GGA ACA GCU AUA ACU AUA CCC AUU GGU CAA GUC CCA AUC 4991 
Arg GLa Ibr Gly Tlir Ala He Thr He Pro He Gly Gin Val Pro He 
1650 1655 1660 

GCA AAU GAA GCA GGG GUU GCA CAC GAG UCA AAA UCC AUG AUG AAC GGG 5039 
Ala Asn Glu Ala Gly Val Ala His Glu Ser Lys Ser Met Met Asn Gly 
1665 1670 1675 

UUG GGU GAU UAC ACA CCA AUA UCG CAA CAA UUG UGU CUA GOA CAA AAU 5087 
Leu Gly Asp Tyr 'Thr Pro He Ser Gin Gin Leu Cys Leu Val Gin Asn 
1680 1S85 1690 1695 

GAC UCG GAU G3S GUA AAG CG5 AAU GUA UUU UCU AUU GGA UAU GGC UCA 5135 
Asp Ser Asp Gly Val Lys Arg Asn Val She Ser He Gly Tyr Gly Ser 

1700 1705 1710 

UAU CJU AUU UCA CCA GCG CAC UUA UUC AAA UAC AAC AAU GGU GAA AUA 5183 
Tyr. Leu He Ser Pro Ala His Leu Phe Lys Tyr Asn Asn Gly Glu He 

1715 1720 1725 

ACA AUU AGA UCA UCA .AGA GGA UUG UAC AAA AUU CGU AAU UCU GUG GAU 5231 
Ttor* He Arg Ser Ser Arg Gly Leu Tyr Lys He Arg Asn Ser Val Asp 
1730 1735 1740 

UUA AAA UUA CAU CCA AUU GCA CAC AGA GAC AUG GUC AUA AUU CAA CUC 5279 
Leu Lys Leu His Pro He Ala His Arg Asp Met Val He He Gin Leu 
1745 1750 1755 

CCA AAG GAU UUC CCA COG UUC CCA AUG CGC UUG AAA UUC GAA CAA CCA 5327 
Pro Lys Asp Phe Pro Pro Phe Pro Met Arg Leu Lys Phe Glu Gin Pro 
1760 1765 1770 1775 

tX::A CGA GAU AUG CGA GUC UGC CUA GUA GGA GUC ^AC UUC CAA CAG AAU " 5375 

Ser Arg Asp Met Arg Val Cys Leu Val Gly Vai . -^n Phe Gin Gin Asn 

1780 1785 1790 
UAU AGC ACU OTC ALr GUA UCA GSU AGU AGU GUG ACA GCA CCA AAA GGA 5423
Tyr Ser Thr Cys Ile Val Ser Glu Ser Ser Val Thr Ala Pro Lys Gly
1795 1800 1805

AAU GGA GAC UUU UGG AAA CAU UGG AUA UCA ACA GUC GAC GGU CAA UGU 5471
Asn Gly Asp Phe Trp Lys His Trp Ile Ser Thr Val Asp Gly Gln Cys
1810 1815 1820 

GGA CUA CCA Uro GUA GAU ACU AAG AGC AAA CAU AUU GUC GGA AUU CAU 5519
Gly Leu Pro Leu Val Asp Thr Lys Ser Lys His Ile Val Gly Ile His
1825 1830 1835 

AGU aXJ GCA UCA ACA AGU GGA AAC ACU AAU UX UUU GUC GCU GUG CCA 5567
Ser Leu Ala Ser Thr Ser Gly Asn Thr Asn Phe Phe Val Ala Val Pro
1840 1845 1850 1855 

GAG AAC UUU AAU GAA UAC AUC AAU GGA CUC GUG CAA GCA AAU AAA UGG 5615
Glu Asn Phe Asn Glu Tyr Ile Asn Gly Leu Val Gln Ala Asn Lys Trp

1860 1865 1870 

GAA AAA GGA UGG CAC UAU AAU CCG AAU CUC AUA UCC UGG UGU GGA CUA 5663 
Glu Lys Gly Trp His Tyr Asn Pro Asn Leu Ile Ser Trp Cys Gly Leu

1875 1880 1885 

AAU UUA GUU GAU UCA GCC CCA AAA GGU UUG UUU AAA ACG UCA AAA UUG 5711
Asn Leu Val Asp Ser Ala Pro Lys Gly Leu Phe Lys Thr Ser Lys Leu
1890 1895 1900 

GUA GAA GAC UUG GAC GCG AGC GUU GAA GAC CAA UGC AAG AUC ACC GAA 5759
Val Glu Asp Leu Asp Ala Ser Val Glu Glu Gln Cys Lys Ile Thr Glu
1905 1910 1915 

ACA UGG CUC ACA GAG CAA UUA CAA GAU AAU UUA CAA GUG GUU GCG AAA 5807 
Thr Trp Leu Thr Glu Gln Leu Gln Asp Asn Leu Gln Val Val Ala Lys
1920 1925 1930 1935 

UGU CCA GGC CAA CUA GUU ACC AAG CAU GUU GUU AAG GGU CAA CCA 5855 
Cys Pro Gly Gln Leu Val Thr Lys His Val Val Lys Gly Gln Cys Pro

1940 1945 1950 

CAC UUU CAA UUG UAC UUA UCA ACA CAU GAC GAU GCU AAA GAA UAC UUC 5903 
His Phe Gln Leu Tyr Leu Ser Thr His Asp Asp Ala Lys Glu Tyr Phe

1955 1960 1965 

GCA CCC AUG CUU GGA AAA UAC GAC AAG AGU AGG CUU AAC GCA GCU 5951
Ala Pro Met Leu Gly Lys Tyr Asp Lys Ser Arg Leu Asn Arg Ala Ala
1970 1975 1980 

UUU AUC AAA GAC AUA UCA AAA UAU GCA AAA CCA AUU UAC AUU GGA GAA 5999 
Phe Ile Lys Asp Ile Ser Lys Tyr Ala Lys Pro Ile Tyr Ile Gly Glu
1985 1990 1995 

AUC GAG UAU GAU AUC UUU GAU AGA GCU GUA CAG CGG GUU GUC AAU AUU 6047 
Ile Glu Tyr Asp Ile Phe Asp Arg Ala Val Gln Arg Val Val Asn Ile
2000 2005 2010 2015
CUC AAA AAU GUU GGA AUG CAA CAA UGC GUU UAU GUC ACA GAU GAA GAA 6095
Leu Lys Asn Val Gly Met Gln Gln Cys Val Tyr Val Thr Asp Glu Glu

2020 2025 2030 

GAA AUA UUC AGA UCA a3U AAC CUG AAC GCA GCU GUC GGA GCA UUG UAU 6143
Glu Ile Phe Arg Ser Leu Asn Leu Asn Ala Ala Val Gly Ala Leu Tyr

2035 2040 2045 

ACA GGA AAG AAG AAA AAU UAC UUU GAA AAU UUU UCA AGC GAA GAC AAA 6191
Thr Gly Lys Lys Lys Asn Tyr Phe Glu Asn Phe Ser Ser Glu Asp Lys 
2050 2055 2060 

GAA GAA AUC GUG AUG AGA UCC UGU GAA CGU AUU UAC AAU GG5 CAA CTO 6239
Glu Glu Ile Val Met Arg Ser Cys Glu Arg Ile Tyr Asn Xaa Gln Leu
2065 2070 2075 

GGC GUA UGG AAU GGA UCG CCT AAA GCU GAG AUC AGA CCA AUA GAG AAA 6287
Gly Val Trp Asn Gly Ser Leu Lys Ala Glu Ile Arg Pro Ile Glu Lys
2080 2085 2090 2095 

ACC AUG CJ3 AAU AAG ACU CGA ACC UUC ACA GCG GCC CCA UUA GAA ACU 6335
Thr Met Leu Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Leu Glu Thr
2100 2105 2110

UUG or GGA GGA AAA GOS UGC GTO GAU GAU UUU AAU AAU CAA UUC UAU 6383 
Leu Leu Gly Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe Tyr

2115 2120 2125 

UCA CAU CAU UUA GAA GGU CCA UGG ACU GUU GGG AUA ACA AAA UUC UAU 6431 
Ser His His Leu Glu Gly Pro Trp Thr Val Gly Ile Thr Lys Phe Tyr
2130 2135 2140 

GGA GGU UGG AAU CGC UUA CUG GAG AAG UUA CCA GAA GGA UGG GUU UAC 6479 
Gly Gly Trp Asn Arg Leu Leu Glu Lys Leu Pro Glu Gly Trp Val Tyr
2145 2150 2155 

UGC GAU GCU GAC GGG UCU CAA UUU AGU UCG UUA ACA CCA UAU CUC 6527
Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Tyr Leu
2160 2165 2170 2175 

AUC AAU GCA GUA UUA AAU AUU CGA UUG CAA UUU AUG GAA GAU UGG GAU 6575
Ile Asn Ala Val Leu Asn Ile Arg Leu Gln Phe Met Glu Asp Trp Asp
2180 2185 2190

AUA GGA GCG CAA AUG CUA AAG AAC CUG UAC ACU GAG AUU GUU UAC ACA 6623
Ile Gly Ala Gln Met Leu Lys Asn Leu Tyr Thr Glu Ile Val Tyr Thr

2195 2200 2205 

CCA AUC GCA ACG CCA GAC GGA UCA AUC GUG AAG AAA UUC AAA GGU AAC 6671 
Pro Ile Ala Thr Pro Asp Gly Ser Ile Val Lys Lys Phe Lys Gly Asn
2210 2215 2220 

AAU AGC GGA CAA CCU UCU ACA GUA GUG GAC AAC ACA UUG AUG GUU AUA 6719 
Asn Ser Gly Gln Pro Ser Thr Val Val Asp Asn Thr Leu Met Val Ile
2225 2230 2235

AUA GCU UUC AAC UAU OCZ AUG CJk UCA AGU GGU AUC AAA GAA GAA GAA 6767
Ile Ala Phe Asn Tyr Ala Met Leu Ser Ser Gly Ile Lys Glu Glu Glu
2240 2245 2250 2255 

AUC GAU AAU UGC UGU AGA AUG UUC GCG AAU GGU GAU GAC UUA CUC CUA 6815
Ile Asp Asn Cys Cys Arg Met Phe Ala Asn Gly Asp Asp Leu Leu Leu

2260 2265 2270 

GCA GUG CAU CCU GAU UUa GAG UUC AUU UUA GAU GAA UTO CAA AAU GAC 6863
Ala Val His Pro Asp Phe Glu Phe Ile Leu Asp Glu Phe Gln Asn His

2275 2280 2285 

UUU GGG AAU CUU GGG CUG AAC UUC GAA UUO ACA UCA CGA ACA CGA GAU 6911
Phe Gly Asn Leu Gly Leu Asn Phe Glu Phe Thr Ser Arg Thr Arg Asp
2290 2295 2300 

AAA UCC GAA CUG UGG UUC AUG UCC ACA P£S^ GGC AUC AAG UAU GAA GGA 6959
Lys Ser Glu Leu Trp Phe Met Ser Thr Arg Gly Ile Lys Tyr Glu Gly
2305 2310 2315 

AUU UAC AUA CCA AAG CUU GAG AAA GAA AGA AUA GUC GCC AUA CUU GAA 7007
Ile Tyr Ile Pro Lys Leu Glu Lys Glu Arg Ile Val Ala Ile Leu Glu
2320 2325 2330 2335 

UGG GAU CGA UCA AAC UUG CCU GAA CAU AGG LTX3 GAA GCU AUA UGU GCA 7055
Trp Asp Arg Ser Asn Leu Pro Glu His Arg Leu Glu Ala Ile Cys Ala

2340 2345 2350 

GCG AUG GUU GAG GCC UGG GSV UAU UCC GAU CUC GUU CAU AUA CGA 7103
Ala Met Val Glu Ala Trp Gly Tyr Ser Asp Leu Val His Glu Ile Arg

2355 2360 2365 

AAG UUC UAU GCG UGG CUU LTC GAA AUG CAA CCU UUU GCA AAU CUC GCA 7151 
Lys Phe Tyr Ala Trp Leu Leu Glu Met Gln Pro Phe Ala Asn Leu Ala
2370 2375 2380 

AAA NAA GGG UUG GCC CCA UAC AUU GCC GAG ACA GCA CUC CGC AAU CUC 7199 
Lys Xaa Gly Leu Ala Pro Tyr Ile Ala Glu Thr Ala Leu Arg Asn Leu
2385 2390 2395 

UAU CUU GGA ACG GGU AUC AAA GAG GAA GAA AUU GAA AAA UAU CUU AAA 7247
Tyr Leu Gly Thr Gly Ile Lys Glu Glu Glu Ile Glu Lys Tyr Leu Lys
2400 2405 2410 2415 

CAA UUC AL^J AAG GAU CUU CCC GGA UAC AUA GAA GAU UAC AAU GAA GAU 7295 
Gln Phe Ile Lys Asp Leu Pro Gly Tyr Ile Glu Asp Tyr Asn Glu Asp

2420 2425 2430 

GUA UUC CAU CAG UCG GGA ACU GUU GAU GCG GGU GCA CAA GGC GGC AGU 7343 
Val Phe His Gln Ser Gly Thr Val Asp Ala Gly Ala Gln Gly Gly Ser

2435 2440 2445 

GGA AGC CAA GGG ACA ACA CCA CCA GCA ACA GGU AGU GGA GCA AAA CCA 7391 
Gly Ser Gln Gly Thr Thr Pro Pro Ala Thr Gly Ser Gly Ala Lys Pro
2450 2455 2460 



GCC ACC UCA GGG GCA GGA UCU GGU AGU GAC ACA GGA GCU GGA ACQ GGU 7439
Ala Thr Ser Gly Ala Gly Ser Gly Ser Asp Thr Gly Ala Gly Thr Gly
2465 2470 2475 

GUA ACU GGA AGU CAA GCA AGG ACA GGC AGU GGC ACU GGG ACG GGA lOT 7487
Val Thr Gly Ser Gln Ala Arg Thr Gly Ser Gly Thr Gly Thr Gly Ser
2480 2485 2490 2495

GGA GCA ACC GGA GGC CAA UCA GGA UCU GGA AGU GGC ACU GAA CAG GUU 7535
Gly Ala Thr Gly Gly Gln Ser Gly Ser Gly Ser Gly Thr Glu Gln Val

2500 2505 2510 

AAC ACG GGU UCA GCA GGA ACU AAU GCA ACU GGA GGC CAA AGA GAU AGG 7583 
Asn Thr Gly Ser Ala Gly Thr Asn Ala Thr Gly Gly Gln Arg Asp Arg

2515 2520 2525 

GAU GUG GAU GCA GGC UCA ACA GGA AAA AUU UCU GUA CCA AAG CUC AAG 7631
Asp Val Asp Ala Gly Ser Thr Gly Lys Ile Ser Val Pro Lys Leu Lys
2530 2535 2540 

GCC AUG UCA AAG AAA AUG CGC X3UA CCU AAA GCA AAA GGA AAA GAU GUG 7679
Ala Met Ser Lys Lys Met Arg Leu Pro Lys Ala Lys Gly Lys Asp Val 
2545 2550 2555 

CUA CAU UUG GAU UUU CUA UUG ACA UAC AAA CCA CAA CAA CAA GAC AUA 7727
Leu His Leu Asp Phe Leu Leu Thr Tyr Lys Pro Gln Gln Gln Asp Ile
2560 2565 2570 2575 

UCA AAC ACU AGA GCA ACC AAG GAA GAG UUU GAU AGA UGG UAU GAU GCC 7775 
Ser Asn Thr Arg Ala Thr Lys Glu Glu Phe Asp Arg Trp Tyr Asp Ala

2580 2585 2590 

AUA AAG AAG GAA UAC GAA AUU GAU GAC ACA CAA AUG ACA GUU GUC AUG 7823 
Ile Lys Lys Glu Tyr Glu Ile Asp Asp Thr Gln Met Thr Val Val Met

2595 2600 2605 

AGU GGC CUU AUG GUA UGG UGC AUC GAA AAU GGU UGC UCA CCA AAC AUA 7871
Ser Gly Leu Met Val Trp Cys Ile Glu Asn Gly Cys Ser Pro Asn Ile
2610 2615 2620 

AAC GGA AAU UGG ACA AUG AUG GAU AAA GAU GAA CAA AGG GUC UUC CCA 7919 
Asn Gly Asn Trp Thr Met Met Asp Lys Asp Glu Gln Arg Val Phe Pro
2625 2630 2635 

CX AAA CCG GUC ALTJ GAG AAU GCA LTTJ CCA .a^J UUC CGA CAA AUG 7967 
Leu Lys Pro Val Ile Glu Asn Ala Ser Pro Thr Phe Arg Gln Ile Met
2640 2645 2650 2655 

CAU CAU UUC AGU GAU GCA GCU GAA GCG UAC AUA GAG UAC AGA ?J\C UCU 8015 
His His Phe Ser Asp Ala Ala Glu Ala Tyr Ile Glu Tyr Arg Asn Ser
2660 2665 2670
ACU GAG CGA UWJ AUG CCA AGA UAC GGA CUU GAG CGC AAU CUC ACC GAC 8063
Thr Glu Arg Tyr Met Pro Arg Tyr Gly Leu Gln Arg Asn Leu Thr Asp

2675 2680 2685 

UAU AGC im GCA CG3 UAU GCA UUU GAU UUC UAU GAA AUG ACU UCA CGC 8111
Tyr Ser Leu Ala Arg Tyr Ala Phe Asp Phe Tyr Glu Met Thr Ser Arg 
2690 2695 2700 

ACA CCU GCU AGA GCU AAA GAA GCC GAC AUG GAG AUG AAA GCC GCA GCA 8159
Thr Pro Ala Arg Ala Lys Glu Ala His Met Gln Met Lys Ala Ala Ala
2705 2710 2715 

GUU CGU GGU UCA AAC ACA CGA CUG UUC GGU UUG GAU GGA AAU GUC GGC 8207
Val Arg Gly Ser Asn Thr Arg Leu Phe Gly Leu Asp Gly Asn Val Gly
2720 2725 2730 2735 

GAG ACU CAG GAG AAU ACA GAG AGA CAC ACA GCU GGC GAU GUU AGU CGC 8255
Glu Thr Gln Glu Asn Thr Glu Arg His Thr Ala Gly Asp Val Ser Arg

2740 2745 2750 

AAC AUG CAC UCU CUG UUG GGA GUG CAG CAA CAC GAC UAGUCUCCUG 8301
Asn Met His Ser Leu Leu Gly Val Gln Gln His His

2755 2760 

GAAACCCUGU UUGCAGUACC AAUAAUAUGU ACUAAUAUAU AGUAUUUUAG UGAGGUUUUA 8361 

CCUCGUCUUU ACUGUUUUAU UACGUAUGUA UUUAAAGCGU GAACCAGUCU GCAACAUACA 8421 

GGGUUGGACC CAGUGUGUUC UGGUGUAGCG UGUACUAGCG UCGAGCCAUG AGAUGGACUG 8481 

CACUGGGUGU GGUUUUGCCA CUUGUGUUQC GAGUCUCCUG GUAAGAGACA AAAAAAAAAA 8541

AA 8543 
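
The pairing of codon rows and amino acid rows in the listing above follows the standard genetic code. As an illustrative aid only, the following minimal Python sketch translates one row of the listing (the codons ending at nucleotide 5039) into three-letter residue names. The function name `translate` and the handling of unrecognised codons as Xaa are assumptions made for this sketch; they are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, with the first, second and third base each
# cycling through U, C, A, G (first base varies slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa",
}


def translate(rna):
    """Translate an RNA string (whitespace ignored) until the first stop codon.

    Codons that are not in the table (for example those containing N)
    are reported as Xaa. This is a hypothetical helper for illustration.
    """
    rna = "".join(rna.split()).upper().replace("T", "U")
    residues = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE.get(rna[i:i + 3], "X")
        if aa == "*":  # stop codon ends translation
            break
        residues.append(THREE_LETTER[aa])
    return residues


row = "GCA AAU GAA GCA GGG GUU GCA CAC GAG UCA AAA UCC AUG AUG AAC GGG"
print(" ".join(translate(row)))
```

Because the listing itself contains ambiguous positions (for example the codon NAA paired with Xaa), the sketch maps any unrecognised codon to Xaa rather than raising an error.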

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2763 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Glu Glu Lys Gln Arg Glu Tyr Leu Ala Lys Asp Gln Lys Leu Ser Arg
1 5 10 15

Met Ile Gln Phe Ile Lys Glu Arg Cys Asn Pro Lys Phe Ser His Leu

20 25 30 

Pro Thr Leu Trp Gln Val Ala Glu Thr Ile Gly :s Tyr Thr Asp Asn
35 40 45 
Gln Ser Lys Gln Ile Met Asp Val Ser Glu Ala Ile Lys Val Asn
50 55 60 

Thr Leu Thr Pro Asp Asp Ala Met Lys Ala Ser Ala Ala Leu Leu Glu
65 70 75 80 

Val Ser Arg Trp Tyr Lys Asn Arg Lys Glu Ser Leu Lys Thr Asp Ser 

85 90 95 

Leu Glu Ser Phe Arg Asn Lys Ile Ser Pro Lys Ser Thr Ile Asn Ala

100 105 110 

Ala Leu Met Cys Asp Asn Gln Leu Asp Lys Asn Ala Asn Phe Val Trp
115 120 125 

Gly Asn Arg Glu Tyr His Ala Lys Arg Phe Phe Ala Asn Tyr Phe Xaa 
130 135 140 

Ala Val Asp Pro Thr Asp Ala Tyr Glu Lys His Val Thr Arg Phe Asn
145 150 155 160

Pro Asn Gly Gln Arg Lys Leu Ser Ile Gly Lys Leu Val Ile Pro Leu

165 170 175 

Asp Phe Gln Lys Ile Arg Glu Ser Phe Val Gly Leu Ser Ile Asn Arg

180 185 190 

Gln Pro Leu Asp Lys Cys Cys Val Ser Lys Ile Glu Gly Gly Tyr Ile
195 200 205 

Tyr Pro Cys Cys Cys Val Thr Thr Glu Phe Gly Lys Pro Ala Tyr Ser
210 215 220



Glu Ile Ile Pro Pro Thr Lys Gly His Ile Thr Ile Gly Asn Ser Ile
225 230 235 240

Asp Ser Lys Ile Val Asp Leu Pro Asn Thr Thr Pro Pro Ser Met Tyr

245 250 255 

Ile Ala Lys Asp Gly Tyr Cys Tyr Ile Asn Ile Phe Leu Ala Ala Met

260 265 270 

Ile Asn Val Asn Glu Glu Ser Ala Lys Asp Tyr Thr Lys Phe Leu Arg
275 280 285 

Asp Glu Leu Val Glu Arg Leu Gly Lys Trp Pro Lys Leu Lys Asp Val 
290 295 300 

Ala Thr Ala Cys Tyr Ala Leu Ser Val Met Phe Pro Glu Ile Lys Asn
305 310 315 320 

Ala Glu Leu Pro Pro Ile Leu Val Asp His Glu Asn Lys Ser Met His

325 330 335 



Val Ile Asp Ser Tyr Gly Ser Leu Ser Val Gly Phe His Ile Leu Lys
340 345 350

Ala Ser Thr Ile Gly Gln Leu Ile Lys Phe Gln Tyr Glu Ser Met Asp
355 360 365 

Ser Glu Met Arg Glu Tyr Ile Val Gly Gly Thr Leu Thr Gln Gla Thr
370 375 380 

Phe Asn Thr Leu Leu Lys Met Leu Thr Lys Asn Met Phe Lys Pro Glu
385 390 395 400 

Arg Ile Lys Gln Ile Ile Glu Glu Glu Pro Phe Leu Leu Met Met Ala

405 410 415 

Ile Ala Ser Pro Thr Val Leu Ile Ala Leu Tyr Asn Asn Cys Tyr Ile

420 425 430 

Glu Gln Ala Met Thr Tyr Trp Ile Val Lys Asn Gln Gly Val Ala Ala
435 440 445 

Ile Ste Ala Gln Leu Glu Ala Leu Ala Lys Lys Thr Ser Gln Ala Glu
450 455 460 

Leu Leu Val Leu Gln Met Gln Ile Glu Lys Ala Ser Asn Gln Leu
465 470 475 480 

Arg Leu Ala Val Ser Gly Leu Ser His Ile Asp Pro Ala Lys Arg Leu

485 490 495 

Leu Trp Ser His Leu Glu Ala Met Ser Thr Arg Ser Glu Met Asn Lys
500 505 510

Glu Leu Ile Ala Glu Gly Tyr Ala Leu Tyr Asp Glu Arg Leu Tyr Thr
515 520 525 

Leu Met Glu Lys Ser Tyr Val Asp Gln Leu Asn Gln Ser Trp Ala Glu
530 535 540 

Leu Ser Tyr Cys Gly Lys Phe Ser Ala Ile Ite Arg Val Phe Arg Val
545 550 555 560 

Arg Lys Tyr Tyr Lys Pro Ser Leu Thr Val Arg Lys Ser Val Asp Leu
565 570 575

Gly Ala Val Tyr Asn Ile Ser Ala Thr His Leu Ile Ser Asp Leu Ala

580 585 590 

Arg Lys Ser Gln Asp Gln Val Ser Ser Ile Leu Thr Lys Leu Arg Asn
595 600 605 

Gly Phe Tyr Asp Lys Leu Glu Lys Val Arg Ile Arg Thr Ile Lys Thr
610 615 620

Val Tyr Trp Phe Ile Pro Asp Ile Phe Arg Leu Val His Ile Phe Ile
625 630 635 640 
Val Leu Ser Leu Thr Thr Ile Ala Asn Thr Ile Ile Val Thr Met

645 650 655 

Asn Asp Tyr Lys Lys Leu Lys Lys Gln Gln Arg Glu Asp Glu Tyr Glu

660 665 670 

Ala Glu Ile Ser Glu Val Arg Arg Ile His Ser Thr Leu Met Glu Glu
675 680 685 



Arg Lys Asp Asn Leu Thr Cys Glu Gln Stoe Ile Glu Tyr Met Arg Xaa
690 695 700

Asn His Pro Arg Leu Val Gly Xaa l&r Leu Asp Leu Thr His Thr Gly
705 710 715 720

Val Ile His Glu Gly Lys Ser Asn Leu Glu Thr Asn Leu Glu Gln Ser

725 730 735 

Met Ala Val Gly Thr Leu Ile Thr Met Ile Leu Asp Pro Gln Lys Ser

740 745 750 

Asp Ala Val Tyr Lys Val Leu Asn Lys Met Arg Thr Val Ile Ser Thr
755 760 765 

Ile Glu Gln Asn Val Pro Phe Pro Ser Val Asn Phe Ser Asn Ile Leu
770 775 780 

Thr Pro Pro Val Ala Gln Gln Ser Val Asp Val Asp Glu Pro Leu Thr
785 790 795 800 

Leu Ser Thr Asp Lys Asn Leu Thr Ile Asp Phe Asp Thr Asn Gln Asp
805 810 815

Leu Pro Ala Asp Thr Ile Ser Asn Asp Val Thr Phe Xaa Asp Trp Trp
820 825 830

Ser Xaa Gln Leu Ser Asn Asn Arg Thr Val Xaa His Tyr Arg Xaa Trp
835 840 845

Gly Glu Ser Xaa Leu Glu Phe Thr Arg Glu Asn Ala Ala His Thr Ser
850 855 860 



Ile Glu Leu Ala His Ser Asn Ile Glu Arg Glu Phe Leu Leu Arg Gly
865 870 875 880

Ala Val Gly Ser Gly Lys Ser Thr Gly Leu Pro Tyr His Leu Ser Met 

885 890 895 

Arg Gly Lys Val Leu Leu Leu Glu Pro Thr Arg Pro Leu Ala Glu Asn 

900 905 910 

Val Cys Arg Gln Leu Gln Gly Pro Pro Phe Asn Val Ser Pro Thr Leu
915 920 925 
Gln Met Arg Gly Leu Ser Ser Phe Gly Cys Thr Pro Ile Thr Ile Met
930 935 940 

Thr Ser Gly Phe Ala Leu His Met Tyr Ala Asn Asn Pro Asp Lys Ile
945 950 955 960 

Ser Glu Tyr Asp Phe Ile Ile Phe Asp Glu Cys His Ile Met Glu Ala

965 970 975 

Pro Ala Met Ala Phe Tyr Cys Leu Leu Lys Glu Tyr Glu Tyr Arg Gly 

980 985 990 

Lys Ile Ile Lys Val Ser Ala Thr Pro Pro Gly Arg Glu Cys Glu Phe
995 1000 1005 

Thr Thr Gln His Pro Val Asp Ile His Val Cys Glu Asn Leu Thr Gln
1010 1015 1020 

Gln Gln Phe Val Met Glu Leu Gly Tto Gly Ser Thr Ala Asp Ala Thr
1025 1030 1035 1040 

Lys Tyr Gly Asn Asn Ile Leu Val Tyr Val Ala Ser Tyr Asn Asp Val

1045 1050 1055 

Asp Ser Leu Ser Gln Ala Leu Val Glu Leu Lys Phe Ser Val Ile Lys

1060 1065 1070 

Val Asp Gly Arg Thr Met Lys Gln Asn Thr Thr Gly Ile Ile Thr Asn
1075 1080 1085

Gly Thr Ala Gln Lys Lys Cys Phe Val Val Ala Thr Asn Ile Ile Glu
1090 1095 1100 

Asn Gly Val Thr Leu Asp Ile Asp Val Gly Val Asp Phe Gly Leu Lys
1105 1110 1115 1120 

Val Ser Ala Asp Leu Asp Val Asp Asn Arg Ala Val Leu Tyr Lys Arg

1125 1130 1135 

Val Ser Ile Ser Tyr Gly Glu Leu Ile Gln Arg Leu Gly Arg Val Gly
1140 1145 1150

Arg Asn Lys Pro Gly Thr Val Ile Arg Ile Gly Lys Thr Met Lys Gly
1155 1160 1165 

Leu Gln Glu Ile Pro Ala Met Ile Ala Thr Glu Ala Ala Phe Met Cys
1170 1175 1180 

VhB Ala Tyr Gly Leu Lys Val Ile Thr His Asn Val Ser Thr Thr His
1185 1190 1195 1200 

Leu Ala Lys Cys Thr Val Lys Gln Ala Arg Thr Met Met Gln Phe Glu

1205 1210 1215 



Leu Ser Pro ^e Val Met Ala Glu Leu Val Lys Phe Asp Gly Ser Met
1220 1225 1230

His Pro Gln Ile His Glu Ala Leu Val Lys Tyr Lys Leu Arg Asp Ser
1235 1240 1245 



Val Ile Met Leu Arg Pro Asn Ala Leu Pro Arg Val Asn His Asn
1250 1255 1260 

Trp Leu Thr Ala Arg Asp Tyr Asn Arg Ile Gly Cys Ser Leu Glu
1265 1270 1275 1280 

Glu Asp His Val Lys Ile Pro Tyr Tyr Arg Gly Val Pro Asp Lys

1285 1290 1295 

Leu Tyr Gly Lys Leu Tyr Asp Ile Ile Leu Gln Asp Ser Pro Thr Ser

1300 1305 1310 

Cys Tyr Ser Arg Leu Ser Ser Ala Cys Ala Gly Lys Val Ala Tyr Thr 
1315 1320 1325 

Leu Arg Thr Asp Pro Phe Ser Leu Pro Arg Thr Ile Ala Ile Ile Asn
1330 1335 1340 

Ala Xaa Ile Thr Glu Glu Tyr Ala Lys Arg Asp His Tyr Arg Asn Met
1345 1350 1355 1360 

Ile Xaa Asn Pro Ser Ser Ser His Ala Phe Ser Leu Asn Gly Leu Val

1365 1370 1375 

Ser Met Ile Ala Thr Arg Tyr Met Lys Asp His Thr Lys Glu Asn Ile

1380 1385 1390 

Asp Lys Leu Ile Arg Val Arg Asp Gln Leu Leu Glu Phe Gln Gly Thr
1395 1400 1405 

Gly Met Gla Phe Gln Asp Pro Ser Glu Leu Met Glu Ile Gly Ala Leu
1410 1415 1420 

Asn Thr Val Ile His Gln Gly Met Asp Ala Ile Ala Ala Cys Ile Glu
1425 1430 1435 1440 

Leu Gln Gly Arg Trp Asn Ala Ser Leu Ile Gln Arg Asp Leu Leu Ile

1445 1450 1455 

Ala Gly Gly Val Phe Ile Gly Gly Ile Leu Met Met Trp Ser Leu Phe

1460 1465 1470 

Thr Lys Trp Ser Asn Thr Asn Val Ser His Gln Gly Lys Asn Lys Arg
1475 1480 1485 

Ser Arg Gln Lys Leu Arg Phe Lys Glu Ala Asp Asn Lys Tyr Ala
1490 1495 1500 

Tyr Asp Val Thr Gly Ser Glu Glu Cys Leu Gly Glu Asn Phe Gly Thr 
1505 1510 1515 1520 



WO 97/02352 -50- PCT/EP96/02673



Ala Tyr ar Lys Lys Gly Lys Gly Lys Gly Thr Lys Val Gly Leu Gly
1525 1530 1535

Val Lys Gln g.^Lys Phe His Met Met Tyr Gly Phe Gln Glu
1545 1550

1565
Glu Gln Ile His Ala Asp Ile Arg Leu Ile Gln Glu His Phe Ala Glu
1570 1575 1580

Ile Arg Glu Glu Ala Val Ile Asn Asp Thr Ile Glu Arg Gln Gln Ile
1585 1590 1595 1600

Tyr Gly Asn Pro Gly Leu Gln Ala Phe Phe Ile Gln Asn Gly Ser Ala
1605 1610 1615

Asn Ala Leu Arg Val Asp Leu Thr Pro His Ser Pro Thr Arg Val Val
1620 1625 1630

Thr Gly Asn Asn Ile Ala Gly Phe Pro Glu Tyr Glu Gly Thr Leu Arg
1635 1640 1645

Gln Thr Gly Thr Ala Ile Thr Ile Pro Ile Gly Gln Val Pro Ile Ala
1650 1655 1660

Asn Glu Ala Gly Val Ala His Glu Ser Lys Ser Met Met Asn Gly Leu
1665 1670 1675 1680

Gly Asp Tyr Thr Pro Ile Ser Gln Gln Leu Cys Leu Val Gln Asn Asp
1685 1690 1695

Ser Asp Gly Val Lys Arg Asn Val Phe Ser Ile Gly Tyr Gly Ser Tyr
1700 1705 1710

Leu Ile Ser Pro Ala His Leu Phe Lys Tyr Asn Asn Gly Glu Ile Thr
1715 1720 1725

Ile Arg Ser Ser Arg Gly Leu Tyr Lys Ile Arg Asn Ser Val Asp Leu
1730 1735 1740

Lys Leu His Pro Ile Ala His Arg Asp Met Val Ile Ile Gln Leu Pro
1745 1750 1755 1760

Lys Asp Phe Pro Pro Phe Pro Met Arg Leu Lys Phe Glu Gln Pro Ser
1765 1770 1775

Arg Asp Met Arg Val Cys Leu Val Gly Val Asn Phe Gln Gln Asn Tyr
1780 1785 1790

Ser Thr Cys Ile Val Ser Glu Ser Ser Val Thr Ala Pro Lys Gly Asn
1795 1800 1805
Gly Asp Phe Trp Lys His Trp Ile Ser Thr Val Asp Gly Gln Cys Gly
1810 1815 1820 

Leu Pro Leu Val Asp Thr Lys Ser Lys His Ile Val Gly Ile His Ser
1825 1830 1835 1840 

Leu Ala Ser Thr Ser Gly Asn Thr Asn Phe Phe Val Ala Val Pro Glu

1845 1850 1855 

Asn Phe Asn Glu Tyr Ile Asn Gly Leu Val Gln Ala Asn Lys Trp Glu

1860 1865 1870 

Lys Gly Trp His Tyr Asn Pro Asn Leu Ile Ser Trp Cys Gly Leu Asn
1875 1880 1885 

Leu Val Asp Ser Ala Pro Lys Gly Leu Phe Lys Thr Ser Lys Leu Val
1890 1895 1900 

Glu Asp Leu Asp Ala Ser Val Glu Glu Gln Cys Lys Ile Thr Glu Thr
1905 1910 1915 1920 

Trp Leu Thr Glu Gln Leu Gln Asp Asn Leu Gln Val Val Ala Lys Cys

1925 1930 1935 

Pro Gly Gln Leu Val Thr Lys His Val Val Lys Gly Gln Cys Pro His

1940 1945 1950 

Phe Gln Leu Tyr Leu Ser Thr His Asp Asp Ala Lys Glu Tyr Phe Ala
1955 1960 1965 

Pro Met Leu Gly Lys Tyr Asp Lys Ser Arg Leu Asn Arg Ala Ala Phe 
1970 1975 1980 

Ile Lys Asp Ile Ser Lys Tyr Ala Lys Pro Ile Tyr Ile Gly Glu Ile
1985 1990 1995 2000 

Glu Tyr Asp Ile Phe Asp Arg Ala Val Gln Arg Val Val Asn Ile Leu

2005 2010 2015 

Lys Asn Val Gly Met Gln Gln Cys Val Tyr Val Thr Asp Glu Glu Glu

2020 2025 2030 

Ile Phe Arg Ser Leu Asn Leu Asn Ala Ala Val Gly Ala Leu Tyr Thr
2035 2040 2045 

Gly Lys Lys Lys Asn Tyr Phe Glu Asn Phe Ser Ser Glu Asp Lys Glu 
2050 2055 2060 



Glu Ile Val Met Arg Ser Cys Glu Arg Ile Tyr Asn Xaa Gln Leu Gly
2065 2070 2075 2080 

Val Trp Asn Gly Ser Leu Lys Ala Glu Ile Arg Pro Ile Glu Lys Thr

2085 2090 2095 

Met Leu Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Leu Glu Thr Leu 






2100 2105 2110 

Leu Gly Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe Tyr Ser
2115 2120 2125 

His His Leu Glu Gly Pro Trp Thr Val Gly Ile Thr Lys Phe Tyr Gly
2130 2135 2140

Gly Trp Asn Arg Leu Leu Glu Lys Leu Pro Glu Gly Trp Val Tyr Cys
2145 2150 2155 2160 

Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Tyr Leu Ile

2165 2170 2175 

Asn Ala Val Leu Asn Ile Arg Leu Gln Phe Met Glu Asp Trp Asp Ile

2180 2185 2190 

Gly Ala Gln Met Leu Lys Asn Leu Tyr Thr Glu Ile Val Tyr Thr Pro
2195 2200 2205 

Ile Ala Thr Pro Asp Gly Ser Ile Val Lys Lys Phe Lys Gly Asn Asn
2210 2215 2220 

Ser Gly Gln Pro Ser Thr Val Val Asp Asn Thr Leu Met Val Ile Ile
2225 2230 2235 2240 

Ala Phe Asn Tyr Ala Met Leu Ser Ser Gly Ile Lys Glu Glu Glu Ile

2245 2250 2255 

Asp Asn Cys Cys Arg Met Phe Ala Asn Gly Asp Asp Leu Leu Leu Ala 

2260 2265 2270 

Val His Pro Asp Phe Glu Phe Ile Leu Asp Glu Phe Gln Asn His Phe
2275 2280 2285 

Gly Asn Leu Gly Leu Asn Phe Glu Phe Thr Ser Arg Thr Arg Asp Lys 
2290 2295 2300 

Ser Glu Leu Trp Phe Met Ser Thr Arg Gly Ile Lys Tyr Glu Gly Ile
2305 2310 2315 2320 

Tyr Ile Pro Lys Leu Glu Lys Glu Arg Ile Val Ala Ile Leu Glu Trp

2325 2330 2335 

Asp Arg Ser Asn Leu Pro Glu His Arg Leu Glu Ala Ile Cys Ala Ala

2340 2345 2350 

Met Val Glu Ala Trp Gly Tyr Ser Asp Leu Val His Glu Ile Arg Lys
2355 2360 2365 

Phe Tyr Ala Trp Leu Leu Glu Met Gln Pro Phe Ala Asn Leu Ala Lys
2370 2375 2380 

Xaa Gly Leu Ala Pro Tyr Ile Ala Glu Thr Ala Leu Arg Asn Leu Tyr
2385 2390 2395 2400 
Leu Gly Thr Gly Ile Lys Glu Glu Glu Ile Glu Lys Tyr Leu Lys Gln

2405 2410 2415 

Phe Ile Lys Asp Leu Pro Gly Tyr Ile Glu Asp Tyr Asn Glu Asp Val

2420 2425 2430 

Phe His Gln Ser Gly Thr Val Asp Ala Gly Ala Gln Gly Gly Ser Gly
2435 2440 2445 

Ser Gln Gly Thr Thr Pro Pro Ala Thr Gly Ser Gly Ala Lys Pro Ala
2450 2455 2460 

Thr Ser Gly Ala Gly Ser Gly Ser Asp Thr Gly Ala Gly Thr Gly Val
2465 2470 2475 2480 

Thr Gly Ser Gln Ala Arg Thr Gly Ser Gly Thr Gly Thr Gly Ser Gly

2485 2490 2495 

Ala Thr Gly Gly Gln Ser Gly Ser Gly Ser Gly Thr Glu Gln Val Asn

2500 2505 2510 

Thr Gly Ser Ala Gly Thr Asn Ala Thr Gly Gly Gln Arg Asp Arg Asp
2515 2520 2525

Val Asp Ala Gly Ser Thr Gly Lys Ile Ser Val Pro Lys Leu Lys Ala
2530 2535 2540 

Met Ser Lys Lys Met Arg Leu Pro Lys Ala Lys Gly Lys Asp Val Leu 
2545 2550 2555 2560 



His Leu Asp Phe Leu Leu Thr Tyr Lys Pro Gln Gln Gln Asp Ile Ser

2565 2570 2575 

Asn Thr Arg Ala Thr Lys Glu Glu Phe Asp Arg Trp Tyr Asp Ala Ile

2580 2585 2590 

Lys Lys Glu Tyr Glu Ile Asp Asp Thr Gln Met Thr Val Val Met Ser
2595 2600 2605 



Gly Leu Met Val Trp Cys Ile Glu Asn Gly Cys Ser Pro Asn Ile Asn
2610 2615 2620 

Gly Asn Trp Thr Met Met Asp Lys Asp Glu Gln Arg Val Phe Pro Leu
2625 2630 2635 2640 

Lys Pro Val Ile Glu Asn Ala Ser Pro Thr Phe Arg Gln Ile Met His

2645 2650 2655 

His Phe Ser Asp Ala Ala Glu Ala Tyr Ile Glu Tyr Arg Asn Ser Thr

2660 2665 2670 

Glu Arg Tyr Met Pro Arg Tyr Gly Leu Gln Arg Asn Leu Thr Asp Tyr
2675 2680 2685 
Ser Leu Ala Arg Tyr Ala Phe Asp Phe Tyr Glu Met Thr Ser Arg Thr
2690 2695 2700

Pro Ala Arg Ala Lys Glu Ala His Met Gln Met Lys Ala Ala Ala Val
2705 2710 2715 2720 

Arg Gly Ser Asn Thr Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Glu

2725 2730 2735 

Thr Gln Glu Asn Thr Glu Arg His Thr Ala Gly Asp Val Ser Arg Asn

2740 2745 2750 

Met His Ser Leu Leu Gly Val Gln Gln His His
2755 2760 
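
The claims refer to individual viral proteins by their residue positions within the polyprotein of SEQ ID NO:2 (for example, positions 1915 to 2435 for the RNA dependent RNA polymerase). The following Python sketch, offered only as an illustration, shows how such 1-based inclusive coordinates map onto a list of residues and what length each recited region has. The names `REGIONS`, `region_length` and `extract_region` are hypothetical and introduced here for the example.

```python
# Residue ranges recited in the claims, as 1-based inclusive positions
# within the 2763-residue polyprotein of SEQ ID NO:2.
REGIONS = {
    "P3 proteinase": (378, 791),
    "helicase": (792, 1430),
    "NIa proteinase": (1484, 1914),
    "RdRp": (1915, 2435),
}


def region_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def extract_region(residues, start, end):
    """Return residues start..end (1-based, inclusive) from a list."""
    return residues[start - 1:end]


for name, (start, end) in REGIONS.items():
    print(f"{name}: {start}-{end} ({region_length(start, end)} residues)")
```

In use, `residues` would be the list of three-letter codes from the listing above; a placeholder such as `list(range(1, 2764))` is enough to check that the slicing returns the intended positions.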

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "first Adh internal control primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
TGCATGTCGG TTCTCTTGCA
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "second Adh internal control primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CTCAGCAACSrr ACCTAGACCA 20



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "first synthetic PAT gene primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TGTCTCCGGA GAGGAGACC 19
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "second synthetic PAT gene primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CCAACATCAT GCCATCCACC 20
(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "first NIa proteinase gene primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GCGOGATCCA TQQG3AAGAA CAAflCSaGT 1GA
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "second NIa proteinase gene primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GCGCSAGCTCT TaCICITCAA aSCTOGOCSTC
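
Each primer entry above states a length and a single-stranded sequence. As an illustrative consistency check only, the Python sketch below computes the length, GC fraction and reverse complement of a primer. It uses SEQ ID NO:5 read as TGTCTCCGGA GAGGAGACC, which is an assumption about the intended characters that agrees with the stated length of 19 base pairs; the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def clean(seq):
    """Remove the spacing used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(seq)))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return sum(base in "GC" for base in seq) / len(seq)


primer = "TGTCTCCGGA GAGGAGACC"  # assumed reading of SEQ ID NO:5
print(len(clean(primer)))          # length to compare with the LENGTH field
print(reverse_complement(primer))  # strand the primer would anneal to
print(round(gc_fraction(primer), 2))
```

A length that disagrees with the LENGTH field of an entry would suggest that characters were lost or garbled in that entry.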
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 10, line 34

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution: Agricultural Research Service Culture Collection (NRRL)

Address of depositary institution (including postal code and country):
1815 North University Street
Peoria, IL 61604
USA

Date of deposit: 29 June 1995 (29.06.95)

Accession Number: NRRL B-21479

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

We request the Expert Solution where available

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

R.L.fl. Pether

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
What is claimed is:

1. A chimeric gene comprising a monocotyledonous plant promoter operably linked to a
nucleotide sequence derived from the genomic sequence of a virus infecting
monocotyledonous plants, wherein said nucleotide sequence contains a modification
rendering a messenger RNA transcribed from said nucleotide sequence incapable of
complete translation.

2. The chimeric gene of claim 1 wherein said virus is selected from the group consisting of
a potyvirus, a luteovirus, a tenuivirus, a carmovirus, a machlovirus, a geminivirus and a
reovirus.

3. The chimeric gene of claim 2 wherein said virus is a potyvirus.

4. A chimeric gene comprising a monocotyledonous plant promoter operably linked to a
nucleotide sequence derived from the genomic sequence of a maize dwarf mosaic
virus, wherein said nucleotide sequence contains a modification rendering a messenger
RNA transcribed from said nucleotide sequence incapable of complete translation.

5. The chimeric gene of claim 4 wherein said virus is maize dwarf mosaic virus strain B. 

6. The chimeric gene of claim 4 wherein said transcribed RNA is capable of translating an 
attenuated peptide of a maize dwarf mosaic virus protein. 

7. The chimeric gene of claim 6 wherein said attenuated peptide is less than 20 amino
acids in length. 

8. The chimeric gene of claim 4 wherein said transcribed RNA cannot be translated. 

9. The chimeric gene of claim 4 wherein said transcribed RNA sequence does not include
the translation initiation codon of said maize dwarf mosaic virus, strain B.

10. The chimeric gene of claim 4 wherein said transcribed RNA sequence encodes a
portion of a viral protein selected from the group consisting of a coat protein, a
proteinase, a replicase, a helicase, a Vpg protein, a 6K protein and a helper
component.

11. The chimeric gene of claim 4 wherein said modification comprises addition of a
premature stop codon into said transcribed RNA.

12. The chimeric gene of claim 4 wherein expression of said gene in transgenic maize,
sorghum or sugarcane inhibits infection of said transgenic plants by maize dwarf 
mosaic virus. 

13. The chimeric gene of claim 12 wherein expression of said gene in transgenic maize
inhibits infection of the transgenic plants by maize dwarf mosaic virus.

14. The chimeric gene of claim 5 wherein said transcribed RNA comprises nucleotides
4452 to 5744 of SEQ ID No. 1 and said modification comprises the substitution of a T
for the A at position 4470 of SEQ ID No. 1.

15. The chimeric gene of claim 14 wherein said modification further comprises the insertion
of an ATG codon immediately before the G at position 4452 of SEQ ID No. 1.

16. The chimeric gene of claim 4 wherein said monocotyledonous plant promoter is
selected from the group consisting of a maize ubiquitin promoter, a maize actin 
promoter and a maize phosphoenolpyruvate carboxylase promoter. 

17. A method for producing a monocotyledonous plant with an inheritable trait of resistance 
to infection by a maize dwarf mosaic virus comprising transforming said plant with a 
chimeric gene according to claim 4. 

18. A monocotyledonous plant having an inheritable trait of resistance to infection by a
maize dwarf mosaic virus, wherein said plant comprises a chimeric gene according to 
claim 4. 

19. A chimeric gene comprising a plant promoter operably linked to a nucleotide sequence 
derived from the genomic sequence of maize dwarf mosaic virus strain B encoding a 
viral protein other than a coat protein, wherein transgenic expression of said chimeric 
gene in a plant inhibits infection of said plant with said virus. 

20. The chimeric gene according to claim 19 wherein said viral protein is selected from the
group consisting of RNA dependent RNA polymerase (RdRp) having the amino acid
sequence from position 1915 to 2435 of SEQ ID No. 2, NIa proteinase having the
amino acid sequence from position 1484 to 1914 of SEQ ID No. 2, helicase having the
amino acid sequence from position 792 to 1430 of SEQ ID No. 2, and P3 proteinase
having the amino acid sequence from position 378 to 791 of SEQ ID No. 2.

21. The chimeric gene of claim 20 wherein said viral protein is a replicase.

22. The chimeric gene of claim 20 wherein said plant promoter is selected from the group
consisting of a plant ubiquitin gene promoter, a plant actin gene promoter, and a plant 
pith-preferred promoter. 

23. A method for producing a plant with an inheritable trait of resistance to infection by
maize dwarf mosaic virus strain B comprising transforming said plant with the chimeric
gene of claim 19.

24. A plant comprising the chimeric gene of claim 22.

25. A method for protecting progeny of a monocotyledonous parent plant from viral
infection comprising transforming said parent plant with a chimeric gene according to
claim 1 and obtaining progeny plants or breeding said parent plant with a plant
according to claim 18.

26. A method according to claim 25, wherein said progeny are protected from infection with
maize dwarf mosaic virus. 

27. A method according to claim 25, wherein the progeny of maize, sorghum or sugarcane
plants are protected from viral infection. 
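
The claims above recite a modification in which a single base substitution introduces a premature stop codon, so that the transcript can no longer be completely translated. The Python sketch below illustrates that idea on a short made-up sequence; the sequence, the position and the helper names are hypothetical and are not taken from SEQ ID No. 1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def substitute(seq, position, expected, replacement):
    """Replace the base at a 1-based position, checking the original base."""
    index = position - 1
    if seq[index] != expected:
        raise ValueError(f"expected {expected} at position {position}, found {seq[index]}")
    return seq[:index] + replacement + seq[index + 1:]


def codons_before_stop(seq):
    """Count codons read from the first base until a stop codon or the end."""
    count = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Toy open reading frame: ATG GCT AAG GAA CTG (hypothetical, for illustration).
original = "ATGGCTAAGGAACTG"
# Substituting T for the A at position 7 turns AAG into the stop codon TAG.
modified = substitute(original, 7, "A", "T")

print(modified)
print(codons_before_stop(original), codons_before_stop(modified))
```

In the toy example the unmodified frame is read through all five codons, while the modified frame stops after two, which mirrors the attenuated peptide described in the claims.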